Data Engineer

Spektrum

The Hague, Netherlands

3 days ago

Role details

Contract type

Contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Intermediate

Job location

The Hague, Netherlands

Tech stack

Microsoft Word

API

Artificial Intelligence

Data analysis

Backup Devices

Big Data

Databases

JSON

Python

KNIME

Power BI

SQL Databases

Visual Studio Online

XML

Data Processing

Scripting (Bash/Python/Go/Ruby)

High Performance Computing

Large Language Models

Jupyter

GitLab

Core Data

Kubernetes

Information Technology

Data Analytics

Search Engines

Data Pipelines

Docker

Job description

The NATO Communications and Information Agency (NCIA), located in The Hague, Netherlands, is currently processing vast amounts of highly varied data coming from theatre for the purpose of efficient archiving. As part of these activities, the Exploiting Data Science and Artificial Intelligence (EDS&AI) team within the NCIA Chief Technology Office is tasked with applying Big Data and AI technology to prepare, run and adjust pipelines that process various source data into archiving formats and metadata, and prepare it for (semantic) search. NATO has an obligation to support national investigations into situations that occurred in theatre. To support the different teams involved as effectively as possible, the EDS&AI team brings the expertise to extract and exploit the vast and varied data at hand, using the Agency's classified high-performance computing sandbox. The EDS&AI team provides the core data science skills and technology needed for big data analysis and AI, and applies innovative technology to data whenever value cannot be extracted with conventional approaches.

Role Duties and Responsibilities

Setting up and improving pipelines that process all required documents and that uniquely identify and trace decisions and processing steps. This work is conducted in the provided classified sandbox environment, using the provided high-performance hardware and toolsets.
Implementing and improving (missing) pipeline steps for marking duplicate files based on file attributes, path (structure) and content (similarity), along with rules for considering a file or structure a duplicate.
Extracting document-format records from Functional Area Systems (FAS) databases and from back-ups made in other ways. Archiving SMEs and system SMEs are available for guidance on target formats, source system structure and data interpretation. Each FAS is processed separately.
Processing various office, image and video file types into the accepted archiving formats and monitoring progress, including extraction of metadata and preparation of semantic search indexes.
Automating the registration of all processed documents and their semantic indexes with the sandbox natural language search tool.
Automating the final copy of all non-duplicate and extracted archive documents, with content and metadata, to the NATO archiving system.
Reporting status, progress and statistics of the (raw) files being processed into archive formats, metadata and search indexes.
Delivering full reporting of results, a trace of the pipeline steps taken and (stakeholder-)accepted failures, with quarterly updates.
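
As an illustration of the duplicate-marking duty above, here is a minimal Python sketch of one possible pipeline step. It is not the team's actual method: the function names `file_digest` and `mark_duplicates` are hypothetical, and the rule used (same file size and same SHA-256 digest means duplicate) is an assumption that covers exact content matches only. Path-structure rules and content-similarity measures would need additional steps.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mark_duplicates(root: Path) -> dict[Path, Path]:
    """Map each duplicate file under root to the first file with identical content."""
    # Group by size first (a cheap file attribute) so only same-size files are hashed.
    by_size: dict[int, list[Path]] = defaultdict(list)
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        by_size[path.stat().st_size].append(path)

    duplicates: dict[Path, Path] = {}
    for candidates in by_size.values():
        if len(candidates) < 2:
            continue
        # Assumed rule: the first path in sorted order counts as the original.
        first_seen: dict[str, Path] = {}
        for path in candidates:
            original = first_seen.setdefault(file_digest(path), path)
            if original is not path:
                duplicates[path] = original
    return duplicates
```

Calling `mark_duplicates(Path("some/folder"))` returns a mapping from each duplicate to the file it duplicates, which could feed the traceability and reporting duties listed above.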

Requirements

At least 3 years of practical experience in the field of data science and/or data analytics;
Experience using data processing, visualization and analytics software packages and development environments, preferably KNIME, VS Code, GitLab, Power BI, JupyterLab and Docker-based APIs;
Experience with Big Data processing, creating and using containerized building blocks, and running containers (APIs) on Kubernetes clusters;
Experience with programming/scripting in languages such as Python, R and SQL, and working with data formats such as CSV, XML and JSON;
Experience with content extraction from files, databases and systems, (LLM-based) embedding models, entity extraction, keyword extraction and content similarity measures;
Creative, flexible and proactive in overcoming obstacles;
Good drafting, communication and presentation skills in English, at both technical and non-technical levels;
High attention to detail and accuracy.

Education

Master's degree in Computer Science, Engineering or a relevant field.
A higher degree in Data Science is preferred.
Valid National or NATO Secret personal security clearance.

About the company

Spektrum supports apex purchasers (NATO, UN, EU, and National Government and Defence) and their Tier 1 supplier ecosystem with a wide range of specialist services. We provide our clients with professional services, specialised aerospace and defence sales, delivery, and operational subject matter expertise. We are looking for personnel to join our team and support key client projects.

The NATO Communications and Information Agency (NCIA) is responsible for providing secure and effective communications and information technology (IT) services to NATO's member countries and its partners. The agency was established in 2012 and is headquartered in Brussels, Belgium. The NCIA provides a wide range of services, including:

* Cyber Security: The NCIA provides advanced cybersecurity solutions to protect NATO's communication networks and information systems against cyber threats.
* Command and Control Systems: The NCIA develops and maintains the systems used by NATO's military commanders to plan and execute operations.
* Satellite Communications: The NCIA provides satellite communications services to enable secure and reliable communications between NATO forces.
* Electronic Warfare: The NCIA provides electronic warfare services to support NATO's mission to detect, deny, and defeat threats to its communication networks.
* Information Management: The NCIA manages NATO's information technology infrastructure, including its databases, applications, and servers.

Overall, the NCIA plays a critical role in ensuring the security and effectiveness of NATO's communication and information technology capabilities.

The program Assistance and Advisory Service (AAS)

The NATO Communications and Information Agency (NCI Agency) is NATO's principal C3 capability deliverer and CIS service provider. It provides, maintains and defends the NATO enterprise-wide information technology infrastructure to enable Allies to consult together under Article IV and, when required, stand together in the face of attack under Article V. To provide these critical services in a modern, evolving and dynamic environment, the NCI Agency needs to build and maintain a high-performance, engaged workforce. The NCI Agency workforce strategically consists of three major categories: NATO International Civilians (NICs), Military (Mil), and Interim Workforce Consultants (IWCs). The IWCs are a critical part of the overall NCI Agency workforce and make up approximately 15 percent of the total workforce.
