ANSA McAL Limited is seeking a suitably qualified candidate for the position of Data Engineer.
The ideal candidate will possess the skills necessary to enable the organization to leverage data for decision-making and strategic planning.
Job Summary
We are seeking an individual with strong technical proficiency for designing, building and maintaining the infrastructure and systems required for collecting, storing and analyzing data efficiently.
Effective communication skills are essential for articulating findings and supporting data-driven decision-making across the organization. Additionally, the candidate should demonstrate strong time management skills and the ability to collaborate effectively with stakeholders across various departments.
Key Responsibilities
Database Management
- Create and maintain databases and data warehouses (e.g., relational databases, data lakes).
- Optimize storage solutions for performance and cost.
- Integrate data from multiple sources, such as APIs, sensors, and legacy systems.
- Ensure data compatibility and proper schema design.
- Manage scalable data pipelines for processing large datasets.
- Ensure data is clean, consistent, and accessible to analysts and data scientists.
Data Collection and Management
- Gather data from various sources, including databases, spreadsheets, APIs, and external sources.
- Clean and preprocess data, ensuring data accuracy and completeness.
- Maintain data integrity through quality checks, data validation, and proper data storage.
Collaboration and Cross-Functional Support
- Work closely with IT and other departments to optimize reporting mechanisms.
- Work with stakeholders (e.g., data scientists, business analysts, software developers) to understand data requirements.
- Coordinate with cloud architects and system administrators for resource allocation.
Qualifications & Experience
The following are qualifications and experience that are considered essential for the job:
- Minimum of a Bachelor’s degree in Data Engineering, Computer Science or a related field
- 5+ years of experience building and maintaining data pipelines, with exposure to cloud technologies being preferred
- 5+ years’ experience in Python programming
- 2+ years’ experience in PySpark programming
- 5+ years’ experience in SQL programming
- 5+ years’ experience in database design and architecture
- 5+ years’ experience in test-driven development
- 5+ years’ experience maintaining Git repositories for a development team
- 3+ years experience with Docker and Kubernetes
- Knowledge of data warehousing and database management
- Architectural background that fosters systematic thinking and problem-solving abilities
- Strong communication and collaboration skills, as the role involves working closely with cross-functional teams, including data scientists, analysts, and business stakeholders, to understand requirements and deliver scalable solutions
- Keeps abreast of industry trends
While not essential, we are particularly interested in candidates with the following qualifications:
- Cloud certification from Microsoft, AWS or Google
- Experience in designing and maintaining data lakes
Knowledge & Skills
The Data Engineer is responsible for constructing and maintaining the frameworks and systems that enable seamless data flow within the organization. This involves designing scalable and robust data pipelines to collect, transform and load data from diverse sources into centralized repositories such as data warehouses or data lakes. The Data Engineer ensures that the data is consistently clean, accurate and accessible to business stakeholders and data teams. It is therefore essential that the Data Engineer can demonstrate advanced-level proficiency in applying the concepts and tools used in:
- 3+ years’ experience with Big Data and Cloud services such as Azure, Amazon Web Services (AWS) or Google Cloud Platform
- Programming with any of SQL, Python, Scala
- ETL and data pipeline tools such as SSIS, Apache Airflow or PowerShell
- Data integration with Azure architecture, handling big data
- Data Lake/ Data warehouse/ Lakehouse
- SQL and NoSQL databases
- Data Cleaning and Modeling e.g. with SSMS
- Data Visualization e.g. using Power BI, Tableau or Qlik
- Knowledge of security measures needed to protect sensitive data and compliance standards like GDPR and HIPAA
- Highly analytical and process oriented
- Excellent oral and written communication skills, including strong documentation skills
- A passion for data, a collaborative mindset, a keen sense of customer service and good business acumen
While not essential, we are particularly interested in hearing from candidates who have experience in any of the following:
- Familiarity with or knowledge of distributed computing frameworks such as Apache Spark
- Familiarity with or knowledge of stream processing frameworks for applications using real-time data
- Familiarity with or experience of Project Management methodologies, ideally those incorporating Agile principles
- Understanding of the legal and regulatory requirements for data protection across the different jurisdictions in which ANSA McAL operates, to ensure that data is adequately protected in transit and at rest
- At least three years’ experience working within a sector in which ANSA McAL operates, e.g. construction, manufacturing, beverage, banking or insurance