
Data Science, Gen AI Specialist
Ford Motor Company
full-time
Location Type: Hybrid
Location: Chennai • India
About the role
- Design NLP/LLM/GenAI applications/products by following robust coding practices.
- Explore state-of-the-art (SoTA) models and techniques that can be applied to automotive industry use cases.
- Conduct ML experiments to train models and run inference; where needed, build models that meet memory and latency constraints.
- Deploy REST APIs or a minimal UI for NLP applications using Docker and Kubernetes.
- Showcase NLP/LLM/GenAI applications to users through web frameworks (Dash, Plotly, Streamlit, etc.).
- Consolidate multiple bots into super apps using multimodal LLMs.
- Develop agentic workflows using AutoGen, Agentbuilder, and LangGraph.
- Build modular AI/ML products that can be consumed at scale.
Requirements
- Bachelor’s or master’s degree in Computer Science, Engineering, Mathematics, or Science.
- Completion of modern NLP/LLM courses or participation in open competitions is also welcome.
- Experience with LLMs such as PaLM and GPT-4, and with open-source models such as Mistral.
- Work through the complete lifecycle of Gen AI model development, from training and testing to deployment and performance monitoring.
- Experience developing and maintaining AI pipelines across multiple modalities such as text, image, and audio.
- Have implemented real-world chatbots or conversational agents at scale that handle different data sources.
- Experience developing image generation/translation tools using latent diffusion models such as Stable Diffusion or InstructPix2Pix.
- Expertise in handling large-scale structured and unstructured data.
- Experience efficiently handling large-scale generative AI datasets and outputs.
- Strong familiarity with deep learning (DL) theory and practice in NLP applications.
- Comfortable coding with Hugging Face, LangChain, Chainlit, TensorFlow and/or PyTorch, scikit-learn, NumPy, and pandas.
- Comfortable using two or more open-source NLP libraries such as spaCy, TorchText, fastai.text, farm-haystack, and others.
- Knowledge of fundamental text data processing (such as regex, token/word analysis, spelling correction and noise reduction, segmenting noisy or unfamiliar sentences/phrases at the right places, and deriving insights from clustering).
- Have implemented real-world fine-tuned BERT or other transformer models (sequence classification, NER, or QA), from data preparation and model creation through inference and deployment.
- Familiarity with Docker tools and pipenv/conda/poetry environments.
- Comfortable following Python project management best practices (setup.py, logging, pytest, relative module imports, Sphinx docs, etc.).
- Familiarity with GitHub (clone, fetch, pull/push, raising issues and PRs, etc.).
- Experience with GCP services such as BigQuery, Cloud Functions, Cloud Run, Cloud Build, and Vertex AI.
- Good working knowledge of other open-source packages for benchmarking and summarizing results.
- Experience using GPUs/CPUs on cloud and on-prem infrastructure.
- Ability to leverage cloud platforms for data engineering, big data, and ML needs.
- Experience with Docker (experimental Docker features, docker-compose, etc.).
- Familiarity with orchestration tools such as Airflow and Kubeflow.
- Experience with CI/CD and infrastructure-as-code tools such as Terraform.
- Experience with Kubernetes or a comparable container tool, along with Helm, Argo Workflows, etc.
- Ability to develop APIs with compliant, ethical, secure, and safe AI tools.
- Good UI skills for visualizing and building better applications using Gradio, Dash, Streamlit, React, Django, etc.
- A deeper understanding of JavaScript, CSS, Angular, HTML, etc. is a plus.
- Skills in distributed computing (specifically parallelism and scalability in data processing, modeling, and inference using Spark, Dask, RAPIDS AI, or RAPIDS cuDF).
- Ability to build Python-based APIs (e.g., using FastAPI, Flask, or Django).
- Experience with Elasticsearch, Apache Solr, and vector databases is a plus.
Benefits
- Strong communication skills and excellent teamwork via Git/Slack/email/calls with multiple team members across geographies.
Applicant Tracking System Keywords
Hard skills
NLP, LLM, GenAI, REST APIs, Docker, Kubernetes, Python, Machine Learning, Data Engineering, Big Data
Soft skills
communication, collaboration, problem-solving, creativity, adaptability, critical thinking, attention to detail, time management, teamwork, leadership
Certifications
Bachelor's degree, Master's degree, NLP/LLM courses