Hadoop Developer

Scientific Systems and Software International Corp...

Provide technical and development support to the government client to build and maintain a modernized Enterprise Data Warehouse (EDW) by expanding the current on-premises Hadoop cluster to accommodate an increased volume of data flowing into the enterprise data warehouse

  • Perform data formatting, including cleaning the data
  • Assign schemas and create Hive tables
  • Apply other HDFS formats and structures (Avro, Parquet, etc.) to support fast data retrieval, user analytics, and analysis
  • Assess the suitability and quality of candidate data sets for the Data Lake and the EDW
  • Design and prepare technical specifications and guidelines
  • Act as a self-starter with the ability to take on complex projects and analyses independently
  • Ensure secure coding practices are followed in all phases of the secure development lifecycle; be knowledgeable in all NGC SSA Programs' HIPAA compliance requirements and proactively address any HIPAA concerns; become familiar with the program's HIPAA policies and procedures and ensure awareness of the HIPAA breach process
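    The data-formatting duty above can be sketched as a small cleanup step run before data is loaded into Hive or converted to Parquet. This is an illustrative assumption, not part of the posting: the field names (patient_id, visit_date) and the MM/DD/YYYY source date format are hypothetical examples of the kind of raw feed such a pipeline might receive.

    ```python
    import csv
    import io
    from datetime import datetime

    def clean_record(row):
        """Normalize one raw record: trim whitespace from keys and values,
        treat empty strings as missing, and standardize the (hypothetical)
        date field to ISO-8601, which keeps Hive schemas and partitions simple."""
        cleaned = {k.strip(): (v.strip() or None) for k, v in row.items()}
        if cleaned.get("visit_date"):
            # Assumed source format MM/DD/YYYY; rewrite as YYYY-MM-DD.
            cleaned["visit_date"] = (
                datetime.strptime(cleaned["visit_date"], "%m/%d/%Y")
                .date()
                .isoformat()
            )
        return cleaned

    # A toy raw feed with stray whitespace and a missing value.
    raw = "patient_id , visit_date\n 1001 , 03/07/2021\n 1002 , \n"
    rows = [clean_record(r) for r in csv.DictReader(io.StringIO(raw))]
    ```

    In practice a step like this would run at scale (e.g., in Spark or an ETL tool named in the qualifications below) rather than with the standard-library csv module, but the cleanup logic is the same.
    
    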

    Basic Qualifications:

    Minimum knowledge, skills, abilities.

    • Must be a US Citizen or Green Card Holder.
    • Master's degree and 12 years of IT experience; Bachelor's degree in computer science or a related field and 14 years of related work experience; or 19 years of related work experience in lieu of a degree
    • 4+ years of proven experience with a range of big data architectures and frameworks, including the Hadoop ecosystem, Java MapReduce, Pig, Hive, Spark, Impala, etc.
    • 5 years of proven experience working with, processing, and managing large data sets (multi-TB scale)
    • Proven experience in ETL (Syncsort DMX-h, Ab Initio, IBM InfoSphere Data Replication, etc.), mainframe skills, and JCL
    • Applicants selected will be subject to a government security investigation and must meet eligibility requirements and be able to obtain a Public Trust level 5 security clearance. Individuals may also be subject to a background investigation, including but not limited to criminal history, employment and education verification, drug testing, and creditworthiness.

    Preferred Qualifications:

    Candidates with these skills will be given preferential consideration.

    • Experience with Apache Hadoop administration (Cloudera distribution preferred)
    • Experience with Linux administration (CentOS and Red Hat)
    • Experience with shell scripting and Python
    • Experience in Big Data Storage and File System Design.
    • Experience in performing troubleshooting procedures and designing resolution scripts
    • Experience with Mainframe development including Java Batch development, Architectural Design/Analysis and Database development.
    • Experience in analytic programming, data discovery, querying databases/data warehouses and data analysis.
    • Experience with data ingestion technologies for relational databases (e.g., DB2, Oracle, SQL)
    • Experience with advanced SQL query writing and data retrieval using “Big Data” Analytics.
    • Experience with enterprise scale application architectures.
    • Proven ability to work with senior technical managers and staff to provide expert-level support for the installation, maintenance, upgrading, and administration of full-featured database management systems

    Other:

    • Knowledge of and experience with application integration; thorough understanding of complex network topologies; solid understanding of system security and risk management
    • Knowledge of Informatica, Syncsort DMX-h, Ab Initio
    • Knowledge of a variety of technical tools, methodologies, and architectures as used in the SSA computing environment
    • Knowledge of leading-edge technologies, new methodologies, and best practices applicable to work performed.
    • Knowledge of SSA’s Enterprise Architecture

    To apply for this job, please visit tinyurl.com.