Principal Consultant (Lead Databricks Developer)

  • Bengaluru
  • Genpact
Genpact (NYSE: G) is a global professional services and solutions firm delivering outcomes that shape the future. Our 125,000+ people across 30+ countries are driven by our innate curiosity, entrepreneurial agility, and desire to create lasting value for clients. Powered by our purpose – the relentless pursuit of a world that works better for people – we serve and transform leading enterprises, including the Fortune Global 500, with our deep business and industry knowledge, digital operations services, and expertise in data, technology, and AI.

Transformation happens here. Come, be a part of our exciting journey!

Responsibilities (the primary tasks, functions and deliverables of the role)

  • Design and build reusable components, frameworks and libraries at scale to support analytics products
  • Design and implement product features in collaboration with business and technology stakeholders
  • Identify and solve data management issues to improve data quality
  • Clean, prepare and optimize data for ingestion and consumption
  • Collaborate on the implementation of new data management projects and the restructuring of the current data architecture
  • Implement automated workflows and routines using workflow scheduling tools
  • Build continuous integration, test-driven development and production deployment frameworks
  • Analyze and profile data for designing scalable solutions
  • Troubleshoot data issues and perform root cause analysis to proactively resolve product and operational issues

Minimum Qualifications / Experience

  • Strong understanding of data structures and algorithms
  • Strong understanding of solution and technical design
  • Strong problem-solving and analytical mindset
  • Able to influence and communicate effectively, both verbally and in writing, with team members and business stakeholders
  • Able to quickly pick up new programming languages, technologies, and frameworks
  • Experience building cloud-scalable, real-time, high-performance data lake solutions
  • Fair understanding of developing complex data solutions
  • Experience working on end-to-end solution design
  • Willing to learn new skills and technologies
  • Passion for data solutions

Required Skills

  • Hands-on experience in Databricks and AWS - EMR [Hive, PySpark], S3, Athena
  • Familiarity with Spark Structured Streaming
  • Working experience with the Hadoop stack, handling huge volumes of data in a scalable fashion
  • Hands-on experience with SQL, ETL, data transformation and analytics functions
  • Hands-on Python experience, including batch scripting, data manipulation and distributable packages
  • Experience working with batch orchestration tools such as Apache Airflow or equivalent; Airflow preferred
  • Experience working with code versioning tools such as GitHub or Bitbucket; expert-level understanding of repo design and best practices
  • Familiarity with deployment automation tools such as Jenkins
  • Hands-on experience designing and building ETL pipelines; expert with data ingest, change data capture and data quality; hands-on experience with API development
  • Experience designing and developing relational database objects; knowledgeable on logical and physical data modelling concepts; some experience with Snowflake
  • Familiarity with Tableau or Cognos use cases

Preferred Qualifications

  • Familiarity with Agile; working experience preferred