The Big Data Architect is a systems architect with extensive experience in big data technologies. This person will oversee the planning and design of new platform solutions for the extraction, transformation, qualification, processing, storage, manipulation, and migration of big data so that it becomes ready and available to applications external to the data lake and to data science tools. This person will work on and design the data lake and the data platform with all the surrounding components related to the data lifecycle, including the Enterprise Service Bus, mapping and reporting engines, API design, etc. Must have extensive knowledge of data warehousing at all levels, from a small dimensional model to a big data lake or data ocean, especially using big data tools such as Kafka, Hadoop (HDFS), Java, Scala, Linux, Spark, Azure Cloud, etc.
- Must be proactive and passionate about technology, with strong communication skills to discuss designs with the technical teams and convey ideas in a professional environment.
- Must be self-managed and team-oriented, able to coordinate his/her own time and to plan tasks ahead with other technical teams, and willing to seek out technical problems the teams may be experiencing and proactively look for solutions.
- At least 3 full years of experience with programming languages: Java/Scala, Python, Go, Node.js.
- At least 2 years’ experience working as a systems architect designing and planning distributed/scalable applications and microservices.
- At least 1 full year of experience working and designing with big data technologies such as Spark, Hadoop (HDFS), Kafka, and data lake concepts: catalog and optimization layers, presentation, and abstraction.
- Experience working with SQL and NoSQL databases such as MSSQL, Oracle, MongoDB, Cassandra, Druid, Elasticsearch, PostgreSQL, etc.
- Experience designing and working with cloud solutions, especially Azure Cloud.
- Previous experience with the Linux operating system: bash scripting, console commands, ext filesystems, the cron scheduler, service administration and implementation, etc.
- Strong experience with code versioning tools: Git and GitHub.
- Experience working in Agile under the Scrum methodology, with industry tools for project planning such as Jira, Confluence, and Lucidchart.
- Excellent written and spoken English skills (> 90%).
- The Big Data Architect will design and help plan the construction of data platform components, from the ingestion of data from source systems into the lake and all the ETL phases involved, to the extraction of that data from the lake to make it available to external applications and BI/data science tools.
- Must create all proper documentation related to those designs, such as system diagrams, process flow diagrams, component descriptions, API references, and programming and industry standards for coupling distributed components.
- Must also be a technical leader for the development team, with the capacity to help them with any implementation questions they may have, as well as with programming issues when needed. Must have enough programming experience to catch possible programming or design bugs during code reviews, before they are implemented, and to provide guidance to fix them accordingly.
- Must have coordination abilities in order to plan development efforts with multiple internal, external, and international teams. Will act as a technical Product Owner, working together with the Scrum Master to write all the technical user stories required for the development teams to build the components that were designed, and must have good technical project management skills to ensure those components are prioritized and constructed on time according to the roadmap of the data platform.
- Will conduct ongoing research into new big data technologies and innovative solutions to the different challenges facing the data platform implementation. Must stay up to date with industry trends and be prepared to create new documentation, training, or presentation materials to inform the rest of the teams about improvements that can be introduced to renovate the data solutions, and to successfully apply them in new component designs.
- Must be able to translate BI/data science projects and POCs into complete engineering applications and solutions, following the company standards for process industrialization and considering all security, optimization, and performance aspects, and must help technical teams understand those requirements in order to implement flexible and scalable code.