Jobs

Exciting employment opportunities within the Tenaya Capital portfolio.

Senior Data Engineer at ResearchGate
Berlin, DE
The web was created by scientists and for scientists, to foster scientific collaboration and drive progress for a better world. Join our team to take the web back to its roots and achieve that original mission.
We’re a passionate team of pragmatic optimists from around the world and from many different backgrounds. Together, we focus on building great products that change the way scientists communicate for the better.
 
We love what we do. We connect the world of science and make research open to all.
 
Objective of the Role
 
As part of ResearchGate’s data engineering teams, you empower decision-making for product managers and data scientists by continuously improving our data pipelines and architecture. To ensure fast and reliable data access, you will also shape and work toward a long-term vision for our data and machine learning infrastructure. Join us to empower ResearchGate and make science happen faster.

Responsibilities

  • Become an essential member of our Machine Learning Infrastructure Architecture Team and shape the long-term vision of ML at ResearchGate
  • Develop a system that enables data teams to quickly iterate on ML-based workloads and easily deploy their models to our production systems
  • Ensure that the data pipelines we use at ResearchGate are ready for future challenges
  • Provide technical leadership and partner with fellow engineers to architect, design, and build infrastructure that scales reliably and remains highly available while reducing operational overhead
  • Engineer efficient, adaptable and scalable data architectures to make building and maintaining big data applications easy and enjoyable for others
  • Build fault-tolerant, self-healing, adaptive, and highly accurate data computation pipelines
  • Work with data scientists, data analysts, backend engineers, and product managers to solve problems, identify trends, and leverage the data we produce
  • Build workflows involving large datasets and/or machine learning models in production using distributed computing and big data processing concepts and technologies

Requirements

  • Experience designing and implementing data pipelines and ML applications
  • Experience working with data at petabyte scale
  • Experience designing and operating robust distributed systems
  • Experience with Java is preferred
  • Working knowledge of relational databases and query authoring (SQL)
  • Experience using technologies like Kafka, Hadoop, Hive, and Flink
  • Experience with machine learning tools, frameworks, and libraries such as Python, R, Jupyter Notebook, scikit-learn, PyTorch, and TensorFlow is a plus