Software Engineer (Data Infrastructure, Remote)

Aarhus, Central Denmark
Posted 6 days, 7 hours ago
Software Development

About the role

Job summary

This role focuses on data collection to support model training within an AI team. The engineer will build high-quality datasets at large scale through a combination of infrastructure, engineering, and research work.

Qualifications

  • BS/MS/PhD in Computer Science or a related discipline.
  • Over 5 years of experience in software development.
  • Proficient in bash/Python scripting within Linux environments.
  • Experienced with Docker and Infrastructure-as-Code on a major cloud provider (GCP preferred).
  • Familiarity with web crawlers and large-scale data processing workflows is advantageous.
  • Strong multitasking abilities and adaptability to changing priorities.
  • Excellent written and verbal communication skills.

Responsibilities

  • Identify and source new audio data for the ingestion pipeline.
  • Manage and enhance the cloud infrastructure for data ingestion, utilizing GCP and Terraform.
  • Collaborate with scientists to optimize cost, throughput, and quality of data for model training.
  • Work with the AI team and leadership to develop a dataset roadmap for future products.

Skills

  • Proficiency in cloud infrastructure management and data processing.
  • Strong analytical and problem-solving skills.

Education

  • Relevant degree in Computer Science or a related field.

Tools

  • GCP, Terraform, Docker, bash, Python.