Sr. Software Engineer, Data and Machine Learning, Birdwatch
Twitter is what’s happening in the world and what people are talking about right now. From breaking news and entertainment to sports, politics, and everyday interests, see every side of the story. Join the open conversation, and collaborate with creative and curious people across the globe.
“The whole world is watching Twitter. You don't go a day without hearing about Twitter, how it’s used as the fastest way to send a message to the world in an instant, how it carries some of the most important commentary and conversations, how it mobilizes people into action. That's powerful, it's valuable, it's fundamental.” - Jack Dorsey
Twitter is seeking a Senior Software Engineer, Data, for Birdwatch, our pilot program for a new crowdsourced, participatory approach to reducing misleading information. Birdwatch pushes the state of the art in addressing misleading information on the Internet, and we take a deeply experimental, fast-moving, and iterative approach to finding product solutions that work for customers.
We are looking for someone who can design, develop, and launch efficient and reliable data pipelines and production machine learning systems. You’ll work as part of a cross-functional team including machine learning, engineering, data science, research, design, product, and even academic experts outside the company who study the space.
What You’ll Do
As Birdwatch grows beyond its pilot phase, you will focus on the software engineering and infrastructure required to build large-scale machine learning applications. You will play a critical role in scaling and launching the core algorithms and data pipelines that power Birdwatch in real time, including:
- Computing contributor helpfulness scores that are resistant to adversaries and bad actors, e.g. using iterative graph propagation algorithms in the style of PageRank, or explicit detection of coordinated manipulation and spam
- Detecting rater similarity/diversity and polarization, using techniques such as matrix factorization to learn user embeddings or similarities
- Determining in real time which users to ask for ratings, and determining an overall label from the crowd
- Building metrics to evaluate our production algorithms, including using human-in-the-loop data
- Open-sourcing as much of our core algorithmic code as possible, in the spirit of Birdwatch’s transparency
Qualifications
- B.S. and/or M.S. in Computer Science or a related technical field, or equivalent experience
- 3+ years of experience in backend systems or distributed systems/large scale data processing
- Experience and familiarity with the modern data pipeline and ML infrastructure ecosystem
- Strong understanding of one or more of the following: Scala, C++, or Java
- Proficiency with Python and SQL
Bonus Qualifications (but Not Necessary)
- Experience owning a production machine learning system and/or pipeline
- Experience with specific technologies you are likely to use, e.g. Kubernetes, Kubeflow Pipelines, TFX, Dataflow/Beam, BigQuery, Airflow
- Basic familiarity with statistics and machine learning, especially in relevant domains, e.g. graph algorithms, matrix factorization, using human-in-the-loop data, bad actor/manipulation modeling, game theory, interpretable and fair ML, active learning
Who You Are
- You’ve built and maintained a large-scale distributed system and/or machine learning data pipeline
- Passion for the problem of misleading information & creating a better informed world
- Enjoy a rapid iterative approach / the 0-to-1 experience
- Enjoy working in an ambiguous space with no known product solutions, and trailblazing to build novel solutions that work
- Ability to move fast and get things done, and help the broader team do so (to enable rapid iteration on the product)
- Comfortable (maybe even excited) to work with a distributed team
- You are experienced with software engineering best practices and bring a disciplined approach to testing and driving reductions in technical debt.