Data Scientist

SGF GLOBAL
Houston, United States of America
6 days ago

Role details

Contract type: Temporary contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Intermediate
Compensation: $210K

Job location

Houston, United States of America

Tech stack

Artificial Intelligence
Artificial Neural Networks
Big Data
C++
Nvidia CUDA
Python
Machine Learning
NumPy
TensorFlow
SciPy
Sensor Fusion
Signal Processing
Software Engineering
PyTorch
Transfer Learning
Deep Learning
Keras
Pandas
Information Technology
Stable Diffusion

Job description

We are seeking a highly skilled Data Scientist to build, train, and deploy large-scale self-supervised "foundation" models for time-series, sensor, multimodal, and industrial/scientific data. The role focuses on developing advanced deep learning architectures that learn rich representations from high-dimensional sequential signals and are later fine-tuned for tasks such as:

  • Anomaly/event detection
  • Predictive maintenance
  • Forecasting
  • Classification
  • Multi-sensor fusion
  • Industrial/scientific modeling

This is a high-impact, research-driven role working with large datasets, complex sensor modalities, and distributed training infrastructure.

1. Foundation Model Development

  • Build and train self-supervised and semi-supervised foundation models for time-series and multimodal data
  • Fine-tune large models for domain-specific tasks
  • Apply contrastive learning, masked modeling, temporal predictive coding, multimodal alignment, etc.
  • Develop transfer learning, adapter, and prompt-based strategies for rapid downstream adaptation

2. Data & Signal Processing

  • Process, augment, and engineer features for univariate/multivariate time-series datasets
  • Analyze IoT sensor streams, industrial vibration/temperature data, audio, imagery, etc.
  • Perform sampling, synchronization, denoising, artifact removal, and sensor quality checks
  • Integrate time series with images, structured data, audio, and text

3. Advanced Machine Learning & Architectures

  • Build models using:
    • RNNs / GRUs / LSTMs
    • TCNs
    • 1D/2D/3D CNNs
    • Transformers (BERT, ViT, TimeSformer)
    • Graph Neural Networks
    • Diffusion / generative architectures
    • Multi-modal encoders and fusion models
  • Evaluate model performance using:
    • MSE, RMSE, R²
    • F1, AUC, Precision/Recall
    • DTW, correlation, similarity metrics
    • IoU and event-based segmentation metrics

4. Software Engineering & Infrastructure

  • Build production-ready pipelines for ingesting, cleaning, segmenting, and aligning large-scale multi-sensor datasets
  • Develop in:
    • Python (NumPy, Pandas, SciPy)
    • PyTorch (Lightning, Distributed)
    • TensorFlow/Keras
    • JAX/Flax
    • C++/CUDA for custom kernels
  • Train models using:
    • Multi-GPU and multi-node clusters
    • Mixed-precision training
    • Distributed optimization (ZeRO, DDP, etc.)

5. Mathematical & Algorithmic Foundations

  • Apply a strong background in:
    • Linear algebra, probability, and statistics
    • Signal processing (Fourier, wavelets, Kalman filters, noise modeling)
    • Optimization (stochastic, convex, non-convex)
    • Numerical methods, ODE/PDE modeling, regularization techniques

6. Collaboration & Communication

  • Partner with scientists, engineers, domain experts, and product teams
  • Present model behavior insights, attention maps, and uncertainty quantification
  • Communicate findings clearly to both technical and non-technical audiences

Requirements

  • MS or PhD in Computer Science, Data Science, AI, Engineering, or related fields
  • 3+ years of experience in Data Science, Machine Learning, or AI
  • Strong experience building and training deep learning models
  • Experience working with time-series or sensor data
  • Proficiency in Python, deep learning frameworks, and ML engineering best practices
  • Experience with multimodal learning
  • Experience with large-scale distributed training
  • Background in industrial, scientific, or sensor-driven AI
