aggregators, accessibility, benchmarking & researchers
https://medium.com/@ODSC/the-best-machine-learning-research-of-2019-so-far-954120947794
The uses of machine learning are expanding rapidly. Already in 2019, significant research has explored new vistas for this technology. Gathered in this article is a list of some of the most exciting research undertaken in the realm of machine learning so far this year: transfer learning, self-supervised learning, competitive training, reliability assessment, and much more.
Last week, The World Economic Forum announced its list of 56 companies selected as Technology Pioneers, and this year’s class demonstrates the growing embrace of artificial intelligence and machine learning across a broad range of sectors.
Of those selected, at least 20 companies say they are using AI or machine learning in some fashion to tackle challenges in fields such as advertising technology, smart cities, cleantech, supply chain, manufacturing, cybersecurity, autonomous vehicles, and drones. The article gives a brief description of each of those 20 companies.
https://techxplore.com/news/2019-07-machine-learning-algorithms-uncover-hidden-scientific.html
Sure, computers can be used to play grandmaster-level chess, but can they make scientific discoveries? Researchers at the U.S. Department of Energy's Lawrence Berkeley National Laboratory (Berkeley Lab) have shown that an algorithm with no training in materials science can scan the text of millions of papers and uncover new scientific knowledge.
A team led by Anubhav Jain, a scientist in Berkeley Lab's Energy Storage & Distributed Resources Division, collected 3.3 million abstracts of published materials science papers and fed them into a Word2vec model. By analyzing relationships between words, the algorithm was able to predict discoveries of new thermoelectric materials years in advance and to suggest as-yet-unknown materials as candidates for thermoelectric materials.
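To make the word-embedding idea concrete, here is a minimal sketch assuming gensim and a handful of pre-tokenized abstracts; the team's actual pipeline involves far more preprocessing and domain-specific tokenization of the 3.3 million abstracts, so treat this purely as an illustration.

```python
# A minimal sketch of the idea, not the Berkeley Lab pipeline; assumes
# `abstracts` is a list of pre-tokenized abstracts (the real corpus has
# millions of them and much heavier preprocessing).
from gensim.models import Word2Vec

abstracts = [
    ["bi2te3", "is", "a", "promising", "thermoelectric", "material"],
    ["the", "band", "gap", "of", "snse", "enables", "high", "zt"],
    # ... millions more tokenized abstracts in the real corpus
]

# Train skip-gram embeddings over the corpus of abstracts.
model = Word2Vec(abstracts, vector_size=200, window=8, min_count=1, sg=1)

# Rank candidate material names by cosine similarity to "thermoelectric";
# materials that score highly but were never described as thermoelectric
# in the literature are the interesting predictions.
candidates = ["bi2te3", "snse"]
ranked = sorted(candidates,
                key=lambda w: model.wv.similarity(w, "thermoelectric"),
                reverse=True)
print(ranked)
```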
The rise of containers solved many challenges involved in traditional software development and deployment. DevOps teams can create container images that become the baseline for workloads running within the data center and in the public cloud. Data scientists and machine learning developers have started to adopt containers for creating consistent and repeatable data science environments. But updating, patching and maintaining the images is still left to the DevOps teams.
The ML and DL community has a piece of good news! Google has officially released a set of container images for mainstream machine learning and deep learning frameworks. Deep Learning Containers from Google provide the portability and consistency to move data science projects from on-premises to the cloud.
The best thing about Deep Learning Containers is that they are not locked to Google Cloud Platform. Though they work best with Google’s AI Platform and Google Kubernetes Engine, anyone can take advantage of this curated set of container images. Of course, you still need an account with Google Cloud Platform to pull the images from its registry, but you can use them to train deep learning models and run inference locally, or even on other cloud platforms running Kubernetes.
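As a rough illustration of pulling and running one of these images outside GCP, here is a sketch using the Docker SDK for Python; the image path, tag, and JupyterLab port below are assumptions, so check the Deep Learning Containers documentation for the exact names, and note that registry authentication is still required as described above.

```python
# A sketch using the Docker SDK for Python (docker-py). The image name and tag
# are illustrative -- consult Google's registry for the exact paths -- and this
# assumes you have already authenticated to gcr.io.
import docker

client = docker.from_env()

# Pull a TensorFlow CPU image from the Deep Learning Containers registry.
image = client.images.pull("gcr.io/deeplearning-platform-release/tf2-cpu",
                           tag="latest")  # real tags may be pinned versions

# Run it locally, mapping port 8080 (where these images typically serve
# JupyterLab) to the host.
container = client.containers.run(
    image,
    ports={"8080/tcp": 8080},
    detach=True,
)
print(container.logs().decode()[:500])
```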
https://venturebeat.com/2019/06/24/mlperf-introduces-machine-learning-inference-benchmark-suite/
A major consortium of AI community stakeholders has introduced MLPerf Inference v0.5, the group’s first suite for measuring the power efficiency and performance of AI systems. Inference benchmarks are essential to understanding just how much time and power it takes to deploy a neural network for common tasks such as computer vision, where the model predicts the contents of an image.
The suite consists of five benchmarks: one English-German machine translation benchmark using the WMT English-German data set, two object detection benchmarks using the COCO data set, and two image classification benchmarks using the ImageNet data set.
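The official suite ships its own load generator, but the quantities an inference benchmark reports are easy to picture with a toy timing loop; in the sketch below the "model" is just a matrix multiply standing in for a real network, so this is an illustration of what gets measured, not the MLPerf harness itself.

```python
# Toy illustration of what an inference benchmark measures: per-query latency
# and overall throughput for a stand-in "model" (a plain matrix multiply).
import time
import numpy as np

rng = np.random.default_rng(0)
weights = rng.standard_normal((1000, 1000)).astype(np.float32)

def infer(batch):
    # Stand-in for a real network such as an image classifier.
    return batch @ weights

latencies = []
queries = 200
for _ in range(queries):
    batch = rng.standard_normal((1, 1000)).astype(np.float32)
    start = time.perf_counter()
    infer(batch)
    latencies.append(time.perf_counter() - start)

latencies.sort()
print(f"p90 latency: {latencies[int(0.9 * queries)] * 1e3:.2f} ms")
print(f"throughput:  {queries / sum(latencies):.1f} queries/sec")
```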
Machine learning and data science are both broad terms. What one data scientist does can be very different from what another does. The same goes for machine learning engineers. What’s common is using the past (data) to understand or predict (build models) the future.
In this article, Daniel Bourke explains the importance of being your own biggest sceptic, the value of trying things that might not work, and why communication problems are harder than technical problems. To put his points in context, Daniel ran a small machine learning consulting team that did it all, from data collection to manipulation to model building to service deployment, in every industry you can think of, so everyone wore many hats. Click through to the article to read the lessons gathered under each hat.
https://ai.googleblog.com/2019/06/innovations-in-graph-representation.html
Relational data representing relationships between entities is ubiquitous on the Web (e.g., online social networks) and in the physical world (e.g., in protein interaction networks). Such data can be represented as a graph with nodes (e.g., users, proteins) and edges connecting them (e.g., friendship relations, protein interactions). Given the prevalence of graphs, graph analysis plays a fundamental role in machine learning, with applications in clustering, link prediction, privacy, and others. To apply machine learning methods to graphs (e.g., predicting new friendships, or discovering unknown protein interactions), one needs to learn a representation of the graph that is amenable to use in ML algorithms.
However, graphs are inherently combinatorial structures made of discrete parts like nodes and edges, while many common ML methods, like neural networks, favor continuous structures, in particular vector representations. Vector representations are particularly important in neural networks, as they can be directly used as input layers. To get around the difficulties in using discrete graph representations in ML, graph embedding methods learn a continuous vector space for the graph, assigning each node (and/or edge) in the graph to a specific position in a vector space. A popular approach in this area is random-walk-based representation learning, as introduced in DeepWalk.
In this article, Google presents the results of two recent papers on graph embedding: “Is a Single Embedding Enough? Learning Node Representations that Capture Multiple Social Contexts”, presented at WWW’19, and “Watch Your Step: Learning Node Embeddings via Graph Attention”, presented at NeurIPS’18. The first paper introduces a novel technique to learn multiple embeddings per node, enabling a better characterization of networks with overlapping communities. The second addresses the fundamental problem of hyperparameter tuning in graph embeddings, allowing one to deploy graph embedding methods with less effort. Google also announced that it has released the code for both papers in the Google Research GitHub repository for graph embeddings.
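For readers new to random-walk-based representation learning, the sketch below shows the DeepWalk recipe in miniature on a toy friendship graph, treating random walks as sentences and feeding them to Word2vec; Google's released code implements the two papers above and is considerably more involved than this.

```python
# A minimal DeepWalk-style sketch (random walks fed to Word2vec) on a toy
# friendship graph, purely for illustration.
import random
from gensim.models import Word2Vec

# Toy friendship graph as an adjacency list.
graph = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice", "dave"],
    "dave": ["bob", "carol"],
}

def random_walk(start, length=10):
    walk = [start]
    for _ in range(length - 1):
        walk.append(random.choice(graph[walk[-1]]))
    return walk

# Generate several walks per node and treat each walk as a "sentence".
walks = [random_walk(node) for node in graph for _ in range(20)]

# Skip-gram Word2vec over the walks yields a continuous vector per node.
model = Word2Vec(walks, vector_size=16, window=3, min_count=1, sg=1)
print(model.wv.most_similar("alice", topn=2))
```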
An AI-produced video could show Donald Trump saying or doing something extremely outrageous and inflammatory. It would be only too believable, and in a worst-case scenario it might sway an election, trigger violence in the streets, or spark an international armed conflict.
Fortunately, a new digital forensics technique promises to protect President Trump, other world leaders, and celebrities against such deepfakes—for the time being, at least. The new method uses machine learning to analyze a specific individual’s style of speech and movement, what the researchers call a “softbiometric signature.”
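As a rough sketch of how such a signature could be applied, the snippet below fits a one-class model on per-clip features from verified footage of an individual and flags clips that deviate from it; the feature extraction here is a random placeholder, whereas the real system derives facial and head-movement features from video.

```python
# Hedged illustration of the "softbiometric signature" idea: fit a one-class
# model on features from genuine footage of one person, then flag clips whose
# features fall outside that signature. The feature extraction is a placeholder
# (random numbers), not the researchers' actual video-analysis pipeline.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

def clip_features(n_clips):
    # Placeholder: one 40-dimensional feature vector per short video clip.
    return rng.standard_normal((n_clips, 40))

genuine = clip_features(500)   # verified footage of the individual
suspect = clip_features(10)    # clips to screen

detector = OneClassSVM(nu=0.05, kernel="rbf", gamma="scale").fit(genuine)
print(detector.predict(suspect))  # +1 = consistent with the signature, -1 = flagged
```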
It’s not hard to imagine that recent advances in machine translation might help translate long lost languages. In just a few years, the study of linguistics has been revolutionized by the availability of huge annotated databases, and techniques for getting machines to learn from them. Consequently, machine translation from one language to another has become routine. And although it isn’t perfect, these methods have provided an entirely new way to think about language.
Enter Jiaming Luo and Regina Barzilay from MIT and Yuan Cao from Google’s AI lab in Mountain View, California. This team has developed a machine-learning system capable of deciphering lost languages, and they’ve demonstrated it by having it decipher Linear B, a script that dates from between 1800 and 1400 BCE, when the Mediterranean island of Crete was dominated by the Bronze Age Minoan civilization. This is the first time this has been done automatically. The approach they used was very different from standard machine translation techniques.
https://ai.googleblog.com/2019/06/predicting-bus-delays-with-machine.html
Hundreds of millions of people across the world rely on public transit for their daily commute, and over half of the world's transit trips involve buses. As the world's cities continue growing, commuters want to know when to expect delays, especially for bus rides, which are prone to getting held up by traffic. While the public transit directions provided by Google Maps are informed by many transit agencies that supply real-time data, many agencies can’t provide it due to technical and resource constraints.
Last month, Google Maps introduced live traffic delays for buses, forecasting bus delays in hundreds of cities world-wide, ranging from Atlanta to Zagreb to Istanbul to Manila and more. This improves the accuracy of transit timing for over sixty million people. This system, first launched in India a few weeks ago, is driven by a machine learning model that combines real-time car traffic forecasts with data on bus routes and stops to better predict how long a bus trip will take.
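A highly simplified sketch of that kind of model is shown below: a regression over illustrative features such as the forecast car travel time along the route and the number of stops. The features, data, and model choice are invented for illustration and are not Google's actual system.

```python
# Hedged sketch of the modelling idea (not Google's actual model): predict a
# bus trip's delay from car-traffic forecasts along the route plus route/stop
# features. Feature names and training data are made up for illustration.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n_trips = 1000

# Illustrative features: forecast car travel time over the route (minutes),
# number of stops, scheduled trip duration (minutes), hour of day.
X = np.column_stack([
    rng.uniform(10, 60, n_trips),
    rng.integers(5, 40, n_trips),
    rng.uniform(15, 70, n_trips),
    rng.integers(0, 24, n_trips),
])
# Synthetic target: delay in minutes, loosely tied to traffic and stop count.
y = 0.3 * X[:, 0] + 0.1 * X[:, 1] - 0.2 * X[:, 2] + rng.normal(0, 2, n_trips)

model = GradientBoostingRegressor().fit(X, y)
print(model.predict([[45, 25, 40, 17]]))  # expected delay for one example trip
```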