
<div style="font-family: sans-serif; line-height: 1.6; padding: 20px;"><div style="margin-bottom: 40px;"><h2 style="font-size: 22px; margin-bottom: 16px; font-weight: bold; color: #333;">Key Responsibilities</h2><div style="color: #333;"><p>High-performance robotics AI models require not just scale but also high-quality, richly structured training data. This position plays a foundational role in <strong>designing and building optimized data pipelines</strong> that aggregate and process diverse robotic sensor inputs—such as cameras, tactile sensors, and IMUs—alongside relevant metadata. Your work will form the core infrastructure that supports Robotics AI development at scale.</p><p><strong>We welcome data-driven engineers eager to build high-quality data infrastructure that powers the next generation of Robotics AI!</strong></p><p><br/></p><ul><li class="ql-indent-1"><strong>Data Strategy and Architecture Design:</strong> Develop strategies for structuring and managing the large-scale datasets used in Robotics AI training.</li><li class="ql-indent-1">Design systems for large-scale data processing using distributed storage, databases, and caching.</li><li class="ql-indent-1"><strong>Automated Data Collection and Preprocessing:</strong> Build pipelines for efficient ingestion and organization of raw sensor data from robots.</li><li class="ql-indent-1">Automate large-scale preprocessing tasks such as noise filtering, synchronization, and format conversion.</li><li class="ql-indent-1"><strong>Data Labeling and Quality Management:</strong> Design labeling workflows for tasks such as object detection, localization, and path planning.</li><li class="ql-indent-1">Integrate tools such as CVAT or Labelbox and implement label verification and quality assurance processes.</li><li class="ql-indent-1"><strong>Data Pipeline Optimization and Operation:</strong> Build robust, scalable data pipelines compatible with CI/CD and MLOps environments.</li><li class="ql-indent-1">Analyze and resolve bottlenecks in the training/validation stages and ensure system scalability.</li><li class="ql-indent-1"><strong>Data Versioning and Metadata Management:</strong> Implement workflows and tools for managing dataset versions systematically.</li><li class="ql-indent-1">Design metadata structures that ensure lifecycle tracking and full traceability of datasets.</li><li class="ql-indent-1"><strong>Cross-Team Collaboration and Monitoring:</strong> Collaborate closely with research and modeling teams to understand and address their data requirements.</li><li class="ql-indent-1">Monitor data quality and resolve operational issues promptly.</li></ul></div></div><div style="margin-bottom: 40px;"><h2 style="font-size: 22px; margin-bottom: 16px; font-weight: bold; color: #333;">Qualifications</h2><div style="color: #333;"><ul><li class="ql-indent-1"><strong>Experience in Data Engineering and Infrastructure:</strong> Proficiency in Python, SQL, and other data processing tools.</li><li class="ql-indent-1">Hands-on experience managing large-scale databases and distributed file systems (e.g., HDFS, AWS S3).</li><li class="ql-indent-1"><strong>Understanding of Robotic Sensor Data Processing:</strong> Familiarity with data from RGB/depth cameras, LiDAR, and IMUs, along with the associated preprocessing techniques.</li><li class="ql-indent-1">Understanding of ROS data formats (e.g., rosbag) or those of similar robotics platforms.</li><li class="ql-indent-1"><strong>Data Pipeline Automation Skills:</strong> Experience using workflow tools such as Airflow or Luigi and building CI/CD data pipelines.</li><li class="ql-indent-1">Hands-on experience managing large-scale ETL (Extract, Transform, Load) processes.</li><li class="ql-indent-1"><strong>Software Development and Collaboration Skills:</strong> Experience with version control tools such as Git and with working in collaborative engineering environments.</li><li class="ql-indent-1">Adherence to software engineering best practices such as code review, testing, and documentation.</li></ul></div></div><div style="margin-bottom: 40px;"><h2 style="font-size: 22px; margin-bottom: 16px; font-weight: bold; color: #333;">Preferred Qualifications</h2><div style="color: #333;"><ul><li class="ql-indent-1"><strong>Experience with Cloud-Based Data Infrastructure:</strong> Experience designing and operating pipelines in AWS, GCP, or Azure environments.</li><li class="ql-indent-1">Familiarity with serverless architectures or container orchestration tools (e.g., Kubernetes).</li><li class="ql-indent-1"><strong>Understanding of ML/DL Workflows and MLOps:</strong> Familiarity with the data needs of machine learning/deep learning pipelines.</li><li class="ql-indent-1">Experience building pipelines for model serving, monitoring, and automated retraining.</li><li class="ql-indent-1"><strong>Proficiency in Labeling Tools and Auto-Labeling Techniques:</strong> Experience using OpenCV, PyTorch, or TensorFlow for automated labeling (e.g., segmentation, keypoint detection).</li><li class="ql-indent-1">Familiarity with active learning methods for optimizing labeling efficiency.</li><li class="ql-indent-1"><strong>Experience with Large-Scale Operations and Incident Response:</strong> Experience processing PB-scale datasets.</li><li class="ql-indent-1">Ability to monitor and resolve issues in large-scale distributed systems (e.g., network or storage failures).</li></ul></div></div></div>