Apex Systems, the 2nd largest IT staffing firm in the nation, is seeking an experienced DevOps Engineer to join our client’s team. This W2 contract position is slated for 6 months with the possibility of extension/conversion and is FULLY REMOTE (PST hours).
• **Must be comfortable sitting on Apex Systems' W2**
If you are interested and qualified, send your resume to Nathan Castillo (Professional Recruiter with Apex Systems) at Ncastillo@apexsystems.com!
We are on a mission to connect every member of the global workforce with economic opportunity, and that starts right here. Talent is our number one priority, and we apply that philosophy both to our customers and to our own employees. Explore cutting-edge technology and flex your creativity. Work with and learn from the best. Push your skills higher. Tackle big problems. Innovate. Create. Write code that makes a difference in professionals’ lives.
Gobblin is a distributed data integration framework that was born at the client and was later released as an open-source project under the Apache Software Foundation. Gobblin is a critical component in the client's data ecosystem and is the main bridge between the different data platforms, allowing efficient data movement between our AI, analytics, and member-facing services. Gobblin utilizes and integrates with the latest open-source big data technologies, including Hadoop, Spark, Presto, Iceberg, Pinot, ORC, Avro, and Kubernetes. Gobblin is a key piece of the client's data lake, operating at a massive scale of hundreds of petabytes.
Our latest work involves integrations with cutting-edge technologies such as Apache Iceberg to allow near-real-time ingestion of data from various sources into persistent datasets, enabling complex and highly scalable query processing for business applications and serving machine-learning and data-science engineers. Furthermore, we play an instrumental role in the client's transformation from on-prem-oriented deployment to Azure cloud-based environments. This transformation has prompted a massive modernization and rebuilding effort for Gobblin, transforming it from a managed set of Hadoop batch jobs into an agile, auto-scalable, real-time, streaming-oriented PaaS with user-friendly self-management capabilities that will boost productivity across our customers. This is an exciting opportunity to take part in shaping the next generation of the platform.
What is the Job
You will be working closely with development and site reliability teams to better understand their challenges in aspects like:
• Increasing development velocity of data management pipelines by automating testing and deployment processes
• Improving the quality of data management software without compromising agility
You will create and maintain fully automated CI/CD processes across multiple environments and make them reproducible, measurable, and controllable for data pipelines that deal with petabytes every day. With your abundant skills as a DevOps engineer, you will also be able to influence the broader teams and cultivate a DevOps culture across the organization.
Why it matters
CI/CD for big data management pipelines has been a traditional challenge for the industry. This is becoming more critical as we evolve our tech stack into the cloud age (Azure). With infrastructure shifts and data lake features being developed and deployed at an ever-faster pace, our integration and deployment processes must evolve to ensure the highest quality and fulfill customer commitments. The reliability of our software greatly influences the analytical workloads and decision-making processes across many company-wide business units, and the velocity of our delivery plays a critical role in transforming the process of mining insights from a massive-scale data lake into an easier and more efficient developer-productivity paradigm.
What You’ll Be Doing
• Work collaboratively in an agile, CI/CD environment
• Analyze, document, implement, and maintain CI/CD pipelines/workflows in cooperation with the data lake development and SRE teams
• Build, improve, and maintain CI/CD tooling for data management pipelines
• Identify areas for improvement for the development processes in data management teams
• Evangelize CI/CD best practices and principles
What We’re Looking For
• Experienced in building and maintaining successful CI/CD pipelines
• Self-driven and independent
• Has experience with Java, Scala, Python, or another programming language
• Great communication skills
• Master of automation
Years of Experience
• Proficient in Java/Scala
• Proficient in Python
• Experienced in working with:
  • Big data environments: Hadoop, Kafka, Hive, YARN, HDFS, Kubernetes (K8s)
  • ETL pipelines and distributed systems
Apex Systems is an equal opportunity employer. We do not discriminate or allow discrimination on the basis of race, color, religion, creed, sex (including pregnancy, childbirth, breastfeeding, or related medical conditions), age, sexual orientation, gender identity, national origin, ancestry, citizenship, genetic information, registered domestic partner status, marital status, disability, status as a crime victim, protected veteran status, political affiliation, union membership, or any other characteristic protected by law. Apex will consider qualified applicants with criminal histories in a manner consistent with the requirements of applicable law. If you have visited our website in search of information on employment opportunities or to apply for a position, and you require an accommodation in using our website for a search or application, please contact our Employee Services Department at firstname.lastname@example.org or 844-463-6178.