The Institute of Solar Research develops innovative technologies for the utilisation of solar energy. The focus is on electricity generation and the provision of heat and fuels. The primary goal is to use solar energy to contribute to the heat transition and to reducing the use of fossil fuels.
What to expect
Our network of sky cameras generates multiple gigabytes of image data every day. Managing, preprocessing, and using this data effectively places high demands on scalable storage, efficient data access, robust metadata management, and reliable quality control. Meeting these demands is essential to ensure data integrity, enable reproducible research, and support downstream analyses and machine learning workflows in operational energy meteorology. You will work at the interface of data engineering and scientific data management, contributing to scalable data ingestion, structured metadata management, automated processing pipelines, and the integration of heterogeneous observational data, with the goal of keeping complex, large-scale datasets accessible, well documented, and usable for advanced analysis.
You will be part of a diverse and motivated team working on energy-transition topics and contributing to climate protection. Close collaboration with supervisors and colleagues will support you in exchanging ideas and solving challenges. You will gain hands-on experience in software development, automated testing, version control and modern image-processing technologies. A particular highlight of the project is the opportunity to work in Almería, Spain, one of the sunniest locations in Europe.
Your tasks
- supporting data ingestion, structuring and organization of large-scale image data to enable efficient access across the internal network
- assisting in the development of data fusion pipelines that combine sky imagery with complementary measurements, such as in-situ sensor data, satellite observations and radar products
- designing and maintaining metadata schemas for cameras, sensors and observations, as well as supporting consistent metadata management across datasets
- documenting data provenance, including preprocessing steps, dataset versioning and dependencies between raw and derived data products
- contributing to the automation of data processing pipelines and workflows
- assisting with the implementation of data management practices aligned with FAIR (Findable, Accessible, Interoperable, Reusable) principles
- contributing to tools for data monitoring, exploratory analysis, visualization and quality control
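To give a flavour of the metadata management and quality-control work described above, here is a minimal sketch in Python. All names and fields (`SkyImageRecord`, `passes_basic_qc`, the exposure bounds) are illustrative assumptions, not part of the institute's actual tooling:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class SkyImageRecord:
    """Illustrative metadata record for a single sky-camera image."""
    camera_id: str
    captured_at: datetime      # should be timezone-aware (UTC)
    exposure_us: int           # exposure time in microseconds
    file_path: str
    provenance: list = field(default_factory=list)  # preprocessing steps applied

def passes_basic_qc(rec: SkyImageRecord) -> bool:
    """Minimal quality-control check: plausible exposure and a timezone-aware timestamp."""
    return 0 < rec.exposure_us < 10_000_000 and rec.captured_at.tzinfo is not None

# Example: a record that documents its own provenance and passes the check.
rec = SkyImageRecord(
    camera_id="cam-almeria-01",
    captured_at=datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc),
    exposure_us=500,
    file_path="raw/2024/06/01/cam-almeria-01_120000.jpg",
)
rec.provenance.append("ingested")
print(passes_basic_qc(rec))  # True
```

In practice such records would feed a searchable catalogue, supporting the FAIR principles by making each image findable via its metadata and its processing history reproducible via the provenance list.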
Your profile
- a strong academic record in a master's program in computer science, physics, mathematics, engineering, or a related field
- experience in Python and basic knowledge of data handling
- interest in large-scale data management, scientific imaging or computer vision
- the ability to work independently and collaborate in an international team
- prior experience with data analysis, computer vision and Git version control
- confidence in spoken and written English
We look forward to getting to know you!
If this sounds like an exciting opportunity for you, please apply by sending us a cover letter and your CV!
If you have any questions about this position (Vacancy-ID 4087), please contact:
Stefan Wilbert
Tel.: +49 2203 601 4619