social cause, bias in algorithms and people, double clicking cloud & open source
Major releases, updates & announcements
Google launches TensorFlow machine learning framework for graphical data
Google recently introduced Neural Structured Learning (NSL), an open source framework that uses the Neural Graph Learning method for training neural networks with graphs and structured data.
NSL works with the TensorFlow machine learning platform and is designed for both experienced and inexperienced machine learning practitioners. NSL can build models for computer vision, perform NLP, and run predictions on graph-structured datasets such as medical records or knowledge graphs.
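To make the idea of training with graphs concrete, here is a minimal sketch of a graph-regularized loss, which is the general idea behind Neural Graph Learning: the usual supervised loss plus a penalty when neighboring nodes get distant embeddings. This is plain Python and not the NSL API; the function names, the `alpha` weight, and the toy data are all assumptions made for illustration.

```python
# Illustrative sketch only: not the NSL API. All names are hypothetical.

def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def graph_regularized_loss(supervised_losses, embeddings, edges, alpha=0.1):
    """Mean supervised loss plus alpha times the weighted neighbor distance.

    supervised_losses: dict node_id -> per-example loss (labeled nodes only)
    embeddings:        dict node_id -> list of floats (the model's representation)
    edges:             list of (node_u, node_v, weight) tuples from the graph
    alpha:             strength of the graph regularizer (assumed hyperparameter)
    """
    supervised = sum(supervised_losses.values()) / len(supervised_losses)
    neighbor_penalty = sum(
        w * squared_distance(embeddings[u], embeddings[v]) for u, v, w in edges
    )
    return supervised + alpha * neighbor_penalty


# Toy graph: "a" and "b" share an embedding, "c" sits far away from "a".
embeddings = {"a": [0.0, 1.0], "b": [0.0, 1.0], "c": [3.0, 5.0]}
edges = [("a", "b", 1.0), ("a", "c", 0.5)]
losses = {"a": 0.2, "b": 0.4}

total = graph_regularized_loss(losses, embeddings, edges, alpha=0.1)
print(total)
```

The edge between "a" and "c" dominates the penalty here, so a model trained against this loss would be pushed to bring connected nodes closer together in embedding space.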
Waymo Shares Autonomous Vehicle Dataset for Machine Learning
Waymo, the self-driving technology company owned by Google's parent company, Alphabet, released a dataset containing sensor data collected by its autonomous vehicles during more than five hours of driving. The set contains high-resolution data from lidar and camera sensors collected in several urban and suburban environments in a wide variety of driving conditions, and includes labels for vehicles, pedestrians, cyclists, and signage. More here
Comparing Machine Learning as a Service: Amazon, Azure, Google Cloud AI, IBM Watson
Machine learning as a service (MLaaS) is an umbrella term for cloud-based platforms that cover most infrastructure concerns, such as data pre-processing, model training, model evaluation, and serving predictions. Prediction results can be bridged with your internal IT infrastructure through REST APIs.
Amazon Machine Learning services, Azure Machine Learning, Google Cloud AI, and IBM Watson are four leading cloud MLaaS services that allow for fast model training and deployment. These should be considered first if you are assembling a homegrown data science team out of available software engineers.
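As a rough illustration of bridging predictions over REST, the sketch below builds (but does not send) a JSON prediction request using only the Python standard library. The endpoint URL, the payload schema, and the feature names are hypothetical; each MLaaS provider defines its own request format and authentication scheme.

```python
import json
import urllib.request

# Hypothetical endpoint and payload schema; real providers each define their own.
ENDPOINT = "https://ml.example.com/v1/models/churn/predict"


def build_prediction_request(features, api_key):
    """Build (but do not send) a JSON POST request for a single prediction."""
    body = json.dumps({"instances": [features]}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


request = build_prediction_request(
    {"tenure_months": 14, "plan": "pro"}, api_key="YOUR_KEY"
)
print(request.get_method(), request.full_url)
# Sending it would be urllib.request.urlopen(request), followed by json.load()
# on the response to read the prediction back into your own systems.
```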
AI researchers from IBM open sourced AI Explainability 360, a new toolkit of state-of-the-art algorithms that support the interpretability and explainability of machine learning models
Google Open-Sources Real-Time Hand Tracking for Android and iOS
Google has open-sourced a new component for its MediaPipe framework aimed at bringing real-time hand detection and tracking to mobile devices.
Google's algorithm uses machine learning (ML) techniques to detect 21 keypoints from a single frame and can be used with multiple hands. Its ability to provide real-time performance on mobile devices sets it apart from competing approaches, which require desktop-class performance, Google says. It is integrated within MediaPipe, a graph-based framework for building applied machine learning pipelines involving video, audio, and sensor data.
Predicting the Future, Amazon Forecast Reaches General Availability
In a recent blog post, Amazon announced the general availability (GA) of Amazon Forecast, a fully managed time series forecasting service. Amazon Forecast uses deep learning from multiple datasets and algorithms to make predictions in the areas of product demand, travel demand, financial planning, SAP and Oracle supply chain planning, and cloud computing usage.
Open-sourcing hyperparameter autotuning for fastText
The new feature released by Facebook for its fastText library automatically determines the best hyperparameters for your data set in order to build an efficient text classifier. To use autotuning, a researcher inputs the training data as well as a validation set and a time constraint. fastText then uses the allotted time to search for the hyperparameters that give the best performance on the validation set. Optionally, the researcher can also constrain the size of the final model. In such cases, fastText uses compression techniques to reduce the size of the model. More here
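The search loop described above can be sketched in a few lines. This is a generic time-budgeted random search in plain Python, not fastText's actual search strategy or API; the `autotune` function, the candidate values, and the toy scoring function (a stand-in for "train a classifier and score it on the validation set") are all assumptions for illustration.

```python
import random
import time


def autotune(evaluate, search_space, time_budget_s, seed=0):
    """Random search: try sampled settings until the time budget runs out.

    evaluate:      callable(params) -> validation score (higher is better)
    search_space:  dict name -> list of candidate values
    time_budget_s: wall-clock seconds allotted to the search
    """
    rng = random.Random(seed)
    deadline = time.monotonic() + time_budget_s
    best_params, best_score = None, float("-inf")
    trials = 0
    # Always run at least one trial so a result exists even for tiny budgets.
    while trials == 0 or time.monotonic() < deadline:
        params = {name: rng.choice(values) for name, values in search_space.items()}
        score = evaluate(params)
        trials += 1
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score, trials


# Hypothetical stand-in for training a model and scoring it on a validation set.
def toy_validation_score(params):
    return -abs(params["lr"] - 0.5) - abs(params["epoch"] - 25) / 100


# Illustrative candidate values, not fastText defaults.
space = {"lr": [0.05, 0.1, 0.5, 1.0], "epoch": [5, 25, 50], "wordNgrams": [1, 2, 3]}
best_params, best_score, trials = autotune(
    toy_validation_score, space, time_budget_s=0.05
)
print(best_params, best_score, trials)
```

A longer time budget simply allows more trials, which is the trade-off the time constraint exposes: more search time can only improve, or leave unchanged, the best validation score found.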
A deep learning technique for context-aware emotion recognition
A key limitation of conventional emotion recognition tools is that they fail to achieve satisfactory performance when emotional signals in people's faces are ambiguous or indistinguishable. In contrast with these approaches, human beings are able to recognize others' emotions based not only on their facial expressions, but also on contextual clues (e.g., the actions they are performing, their interactions with others, where they are, etc.).
A team of researchers at Yonsei University and École Polytechnique Fédérale de Lausanne (EPFL) has recently developed a new technique that can recognize emotions by analyzing people's faces in images along with contextual features. They presented and outlined their deep learning-based architecture, called CAER-Net, in a paper pre-published on arXiv.
In-house data science capabilities and foundations of giants
An overview of how LinkedIn, Uber, Lyft, Airbnb and Netflix are Solving Data Management and Discovery for Machine Learning Solutions
New advances in natural language processing to better connect people
Facebook AI has achieved impressive breakthroughs in NLP using semi-supervised and self-supervised learning techniques, which leverage unlabeled data to improve performance beyond purely supervised systems. The article provides an overview of the Facebook AI team's achievements over the last few months.
Gender and Racial Bias in Cloud NLP Sentiment APIs
In the course of building tools that support NLP, practitioners often encounter, and have to work around, gender and racial bias that gets baked into the machine learning models used for text analysis. This is an acknowledged problem confronting NLP, and the solutions are not simple. Building fair and non-toxic NLP systems requires constant vigilance, and companies continuously audit new platforms and models to make sure that users of their systems are not adversely impacted.
In the course of these audits, the Data Team at Automattic found evidence of gender and racial bias in the sentiment analysis services offered by Amazon (called Amazon Comprehend) and, to a much lesser extent, by Google (part of its Cloud Natural Language API). This article provides an in-depth analysis of their observations.
Autonomous Vehicles for Social Good: Learning to Solve Congestion
The proper deployment of autonomous vehicles should minimize gridlock, decrease total energy consumption, and maximize the capacity of our roadways. While there have been decades of research on these questions, there is no consensus on the optimal driving strategies to employ, nor are there easy metrics by which a self-driving car company could assess a driving strategy and then choose to implement it in its own vehicles. The AI research team at Berkeley postulated that one reason for this gap is the absence of benchmarks (standardized problems that can be used to compare progress across research groups and methods) and released their <CORL paper>, which proposes 11 new benchmarks in centralized mixed-autonomy traffic control: traffic control where a small fraction of the vehicles and traffic lights are controlled by a single computer.
Interview with Dask’s creator: Scale your Python from one computer to a thousand
Software engineering for machine learning: a case study
Adrian Colyer reviews an internal Microsoft study (with over 500 participants) designed to figure out how product development and software engineering are changing at Microsoft with the rise of AI and ML.
Machine learning and bias - impact of bias and ways of eliminating bias from machine learning models
Watch worthy
Ideas from Robin Hauser on how we can protect AI from our biases
Interesting nuggets on how algorithms shape (and don't shape) our world
Unarchiving Arxiv
Clone them now <repositories that deserve a click>