by   -   December 2, 2016
Source: commaai

There have been few postings this month, as I took the time to enjoy a holiday in New Zealand around speaking at the SingularityU New Zealand summit in Christchurch. The night before the summit, we enjoyed a 7.8 earthquake not so far from Christchurch, whose downtown was 2/3 demolished after quakes in 2010 and 2011. On the 11th floor of the hotel it was a disturbing nailbiter of swaying back and forth for over 2 minutes — but of course, swaying is what the building is supposed to do, that means it’s working. The shocks were rolling, not violent, and in fact, we got more violent jolts from aftershocks a week later when we went to Picton.

by   -   December 1, 2016
Tomaso Poggio, a professor of brain and cognitive sciences at MIT and director of the Center for Brains, Minds, and Machines, has long thought that the brain must produce “invariant” representations of faces and other objects, meaning representations that are indifferent to objects’ orientation in space, their distance from the viewer, or their location in the visual field. Image Credit: Massachusetts Institute of Technology

MIT researchers and their colleagues have developed a new computational model of the human brain’s face-recognition mechanism that seems to capture aspects of human neurology that previous models have missed.

by   -   December 1, 2016


Vidi Systems from Switzerland is the overall Grand Winner of the 2016 Robot Launch global startup competition, beating out many US contestants in a field that included sensors, artificial intelligence, social robots, service robots and industrial solutions. Overall, the European robotics startups performed very strongly this year with 8 making The Shortlist for awards. Canada also had good representation with 3 entries, but the rest of The Shortlist were based in the USA, even if they had originated in Israel or Hong Kong.

by   -   December 1, 2016


Seventeen robotics-related companies got funded for a combined total of over $225 million. Four more got acquired. Three went public to raise funds. And one failed.

by   -   December 1, 2016



Yesterday, Onalytica released a report “Robotics: Top 100 Influencers and their Brands” that analysed interactions between robotics influencers and brands on Twitter over a period of 90 days. What they found was pretty interesting:

“We analysed over 551K tweets mentioning the keywords “Robotics OR Robotic” from 18th September – 18th November 2016. We then identified the top 100 most influential brands and individuals leading the discussion on Twitter. What we discovered was a very engaged community, with much discussion between individuals and brands.”

We’re proud to say that Robohub, as a brand, was ranked #10 out of 100. That makes us one of the top robotics-focused blogs. Not too shabby for a non-profit organisation that started 4 years ago!

Source: Onalytica

Contributing Robohub experts were also on the list, namely: Ryan Calo, Andra Keay, Sammy Payne, Sabine Hauert, and Hallie Siegel.

You can download the full report for free here, which describes their PageRank methodology, data, and rankings.

by   -   November 30, 2016


U.S. Sen. Ted Cruz (R-Texas), chairman of the Subcommittee on Space, Science, and Competitiveness will convene a hearing today, 30 November, at 2:30 p.m. EST on “The Dawn of Artificial Intelligence.” The hearing will conduct a broad overview of the state of artificial intelligence, including policy implications and effects on commerce.

by   -   November 30, 2016


The common, and recurring, view of the latest breakthroughs in artificial intelligence research is that sentient and intelligent machines are just on the horizon. Machines understand verbal commands, distinguish pictures, drive cars and play games better than we do. How much longer can it be before they walk among us?

The new White House report on artificial intelligence takes an appropriately skeptical view of that dream. It says the next 20 years likely won’t see machines “exhibit broadly-applicable intelligence comparable to or exceeding that of humans,” though it does go on to say that in the coming years, “machines will reach and exceed human performance on more and more tasks.” But its assumptions about how those capabilities will develop missed some important points.

As an AI researcher, I’ll admit it was nice to have my own field highlighted at the highest level of American government, but the report focused almost exclusively on what I call “the boring kind of AI.” It dismissed in half a sentence my branch of AI research, into how evolution can help develop ever-improving AI systems, and how computational models can help us understand how our human intelligence evolved.

The report focuses on what might be called mainstream AI tools: machine learning and deep learning. These are the sorts of technologies that have been able to play “Jeopardy!” well, and beat human Go masters at the most complicated game ever invented. These current intelligent systems are able to handle huge amounts of data and make complex calculations very quickly. But they lack an element that will be key to building the sentient machines we picture having in the future.

We need to do more than teach machines to learn. We need to overcome the boundaries that define the four different types of artificial intelligence, the barriers that separate machines from us – and us from them.

Type I AI: Reactive machines

The most basic types of AI systems are purely reactive and have the ability neither to form memories nor to use past experiences to inform current decisions. Deep Blue, IBM’s chess-playing supercomputer, which beat international grandmaster Garry Kasparov in the late 1990s, is the perfect example of this type of machine.

Deep Blue can identify the pieces on a chess board and know how each moves. It can make predictions about what moves might be next for it and its opponent. And it can choose the optimal moves from among the possibilities.

But it doesn’t have any concept of the past, nor any memory of what has happened before. Apart from a rarely used chess-specific rule against repeating the same move three times, Deep Blue ignores everything before the present moment. All it does is look at the pieces on the chess board as it stands right now, and choose from possible next moves.

This type of intelligence involves the computer perceiving the world directly and acting on what it sees. It doesn’t rely on an internal concept of the world. In a seminal paper, AI researcher Rodney Brooks argued that we should only build machines like this. His main reason was that people are not very good at programming accurate simulated worlds for computers to use, what is called in AI scholarship a “representation” of the world.

The current intelligent machines we marvel at either have no such concept of the world, or have a very limited and specialized one for their particular duties. The innovation in Deep Blue’s design was not to broaden the range of possible moves the computer considered. Rather, the developers found a way to narrow its view, to stop pursuing some potential future moves, based on how it rated their outcome. Without this ability, Deep Blue would have needed to be an even more powerful computer to actually beat Kasparov.
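The idea of narrowing the search can be illustrated with alpha-beta pruning, a standard game-tree technique. This is a minimal sketch under stated assumptions, not a description of Deep Blue's actual code: the tree is a hypothetical nested list whose leaves are position ratings, and the function name `alphabeta` and the `visited` list are introduced here purely for illustration.

```python
import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, visited=None):
    """Return the minimax value of `node`, skipping branches that cannot matter.

    `node` is either a number (a rated leaf position) or a list of child nodes.
    `visited` optionally records which leaves were actually evaluated.
    """
    if visited is None:
        visited = []
    if isinstance(node, (int, float)):  # leaf: a position that has been rated
        visited.append(node)
        return node
    best = -math.inf if maximizing else math.inf
    for child in node:
        value = alphabeta(child, not maximizing, alpha, beta, visited)
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if alpha >= beta:
            # The opponent already has a better option elsewhere,
            # so the remaining moves in this branch are never explored.
            break
    return best

# Hypothetical toy tree: three candidate moves, each with two replies.
seen = []
best_value = alphabeta([[3, 5], [2, 9], [0, 1]], True, visited=seen)
print(best_value, seen)
```

In this toy tree the leaves rated 9 and 1 are never evaluated, because the first reply in each of those branches already shows the branch is worse than the first candidate move.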

Similarly, Google’s AlphaGo, which has beaten top human Go experts, can’t evaluate all potential future moves either. Its analysis method is more sophisticated than Deep Blue’s, using a neural network to evaluate game developments.

These methods do improve the ability of AI systems to play specific games better, but they can’t be easily changed or applied to other situations. These computerized imaginations have no concept of the wider world – meaning they can’t function beyond the specific tasks they’re assigned and are easily fooled.

They can’t interactively participate in the world, the way we imagine AI systems one day might. Instead, these machines will behave exactly the same way every time they encounter the same situation. This can be very good for ensuring an AI system is trustworthy: You want your autonomous car to be a reliable driver. But it’s bad if we want machines to truly engage with, and respond to, the world. These simplest AI systems won’t ever be bored, or interested, or sad.

Type II AI: Limited memory

This Type II class contains machines that can look into the past. Self-driving cars do some of this already. For example, they observe other cars’ speed and direction. That can’t be done in just one moment, but rather requires identifying specific objects and monitoring them over time.
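To see why a single moment is not enough, consider a minimal sketch that estimates speed and direction from two timed position observations of the same object. The function name `estimate_motion`, the flat 2D coordinates in metres, and the sample values are all assumptions made for illustration; real self-driving systems use far richer tracking.

```python
import math

def estimate_motion(p0, t0, p1, t1):
    """Estimate speed and heading from two (x, y) positions seen at times t0 < t1.

    Returns (speed, heading_degrees), with heading measured counter-clockwise
    from the positive x axis.
    """
    dt = t1 - t0
    if dt <= 0:
        raise ValueError("the second observation must come after the first")
    vx = (p1[0] - p0[0]) / dt
    vy = (p1[1] - p0[1]) / dt
    speed = math.hypot(vx, vy)
    heading = math.degrees(math.atan2(vy, vx))
    return speed, heading

# Hypothetical observations: a car seen at (0, 0) and, one second later, at (3, 4).
speed, heading = estimate_motion((0.0, 0.0), 0.0, (3.0, 4.0), 1.0)
print(speed, heading)
```

With only one observation there is no `dt` and no displacement, so neither quantity can be computed; this is the sense in which the car must remember at least a short stretch of the past.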

These observations are added to the self-driving cars’ preprogrammed representations of the world, which also include lane markings, traffic lights and other important elements, like curves in the road. They’re included when the car decides when to change lanes, to avoid cutting off another driver or being hit by a nearby car.

But these simple pieces of information about the past are only transient. They aren’t saved as part of the car’s library of experience it can learn from, the way human drivers compile experience over years behind the wheel.

So how can we build AI systems that build full representations, remember their experiences and learn how to handle new situations? Brooks was right in that it is very difficult to do this. My own research into methods inspired by Darwinian evolution can start to make up for human shortcomings by letting the machines build their own representations.

Type III AI: Theory of mind

We might stop here, and call this point the important divide between the machines we have and the machines we will build in the future. However, it is better to be more specific and discuss the types of representations machines need to form, and what they need to be about.

Machines in the next, more advanced, class not only form representations about the world, but also about other agents or entities in the world. In psychology, this is called “theory of mind” – the understanding that people, creatures and objects in the world can have thoughts and emotions that affect their own behavior.

This is crucial to how we humans formed societies, because it allowed us to have social interactions. Without understanding each other’s motives and intentions, and without taking into account what somebody else knows either about me or the environment, working together is at best difficult, at worst impossible.

If AI systems are indeed ever to walk among us, they’ll have to be able to understand that each of us has thoughts and feelings and expectations for how we’ll be treated. And they’ll have to adjust their behavior accordingly.

Type IV AI: Self-awareness

The final step of AI development is to build systems that can form representations about themselves. Ultimately, we AI researchers will have to not only understand consciousness, but build machines that have it.

This is, in a sense, an extension of the “theory of mind” possessed by Type III artificial intelligences. Consciousness is also called “self-awareness” for a reason. (“I want that item” is a very different statement from “I know I want that item.”) Conscious beings are aware of themselves, know about their internal states, and are able to predict feelings of others. We assume someone honking behind us in traffic is angry or impatient, because that’s how we feel when we honk at others. Without a theory of mind, we could not make those sorts of inferences.

While we are probably far from creating machines that are self-aware, we should focus our efforts toward understanding memory, learning and the ability to base decisions on past experiences. This is an important step to understand human intelligence on its own. And it is crucial if we want to design or evolve machines that are more than exceptional at classifying what they see in front of them.

This article was originally published on The Conversation. Read the original article.


by   -   November 30, 2016


I was reading Steve Blank’s blog on machine learning start-ups yesterday, where he described how technical infrastructure innovations follow a Hype Cycle with characteristics defined by the Gartner Group.

The Hype Cycle follows the life cycle of emerging technologies and will be applicable to the new generation of robotic vision technologies currently being developed by the Australian Centre for Robotic Vision. Robotic vision, as the name suggests, encompasses both robotics and computer vision, and each has been moving through new hype cycles, despite such technologies first appearing more than 50 years ago.

As described by Gartner Group, hype cycles progress as follows:

Phase 1 – Technology Trigger: A potential technology breakthrough kicks things off. Early proof-of-concept stories and media interest trigger significant publicity. Often no usable products exist and commercial viability is unproven.

Phase 2 – Peak of Inflated Expectations: Early publicity produces a number of success stories — often accompanied by scores of failures. Some companies take action; many do not.

Phase 3 – Trough of Disillusionment: Interest wanes as experiments and implementations fail to deliver. Producers of the technology shake out or fail. Investments continue only if the surviving providers improve their products to the satisfaction of early adopters.

Phase 4 – Slope of Enlightenment: More instances of how the technology can benefit the enterprise start to crystallise and become more widely understood. Second- and third-generation products appear from technology providers. More enterprises fund pilots; conservative companies remain cautious.

Phase 5 – Plateau of Productivity: Mainstream adoption starts to take off. Criteria for assessing provider viability are more clearly defined. The technology’s broad market applicability and relevance are clearly paying off.

The new wave of robotic and computer vision technologies is in Phase 2 of the hype cycle, the peak of inflated expectations. Each year scores of robotics and computer vision start-ups are acquired, often for huge amounts of money. My sister, Andra Keay, is the managing director of Silicon Valley Robotics, which supports innovation and commercialisation of robotic technologies. At one stage it was common for new robotics start-ups to exit the start-up scene before even finding their feet, acquired by larger players. See this great timeline of robotics acquisitions put together by CB Insights:


The story is similar for computer vision start-ups. Index lists just some of the recently acquired CV companies. Some are acquired with little fanfare, such as the speculated acquisition of ZurichEye by Facebook. Computer vision sometimes applies artificial intelligence (AI) and machine learning, the subject of Steve Blank’s blog, and all technologies seem to be hitting the peak of inflated expectations. Phase 2 is a great place to be if you’re a start-up but not necessarily a great place to be for the acquirer, who gains technology, teams and tools but not necessarily much additional revenue and profit.

Robotic vision is the gateway to a whole new set of technologies, developed by bringing robotics and computer vision together. While robotics is about machines that perceive and interact with the physical world, computer vision involves methods for acquiring, processing, analysing and understanding images using a computer. Combining the two produces the key technologies that will allow robotics to transform the way we live and work by giving robots visual perception. So, where is robotic vision on the hype cycle?

Robotic vision is in Phase 1, just kicking off, with research groups like the Australian Centre for Robotic Vision developing a few proofs of concept, such as the crown-of-thorns starfish robot, COTSbot, and the agricultural robots AgBotII and Harvey. It will soon be heading into Phase 2, which is a great time for new start-ups looking for a big early exit. But will the window of opportunity for robotic vision companies be narrow?

Given the high level of current interest in fields related to robotic vision, start-ups developed around these new technologies will need to enter the market soon. Companies based on the interlinked fields of robotics, machine learning, AI and computer vision have been acquired by corporate giants such as Google, IBM, Yahoo, Intel, Apple and Salesforce, at increasing rates since 2011 (see CB Insights). How long can the hype continue before we enter into Phase 3, the trough of disillusionment?

The risk for robotic vision is that by the time start-ups have formed around newly created technologies, the hype cycle for related technologies will change. If robotic vision doesn’t get a foot in the door, RV technologies will skip the peak of inflated expectations altogether. On the plus side, robotic vision technologies will also enjoy an accelerated path towards the slope of enlightenment on the heels of related technologies. Good news for the application of robotic vision to solve real world challenges but bad news for robotic vision start-ups – unless they form soon.

by   -   November 29, 2016


Dear Readers,
For #GivingTuesday, we would like to ask you, our loyal readers, to consider donating to Robohub. From the beginning, our mission has been to help demystify robotics by hearing straight from the experts. Robohub isn’t like most news websites. We’re a community. We’re a forum. Much of our content is written directly by the experts in academia, businesses, and industry. That means you get to learn about the latest research and business news, events and opinions, directly from the experts, unfiltered, with no media bias. Our goal is to keep you engaged and interested in robotics that may not necessarily be covered by top news agencies.

by   -   November 29, 2016
Hiroshi Ota and Minako Inoue with 2 Robovie R3 robots in Oriza Hirata's "I, Worker"

How can robotics help to enhance the development of the modern arts? Japan’s famous playwright and stage director Oriza Hirata and leading roboticist Hiroshi Ishiguro launched the “Robot Theater Project” at Osaka University to explore the boundaries of human-robot interaction through robot theater. Their work includes renditions of Anton Chekhov’s “Three Sisters”, Franz Kafka’s “The Metamorphosis”, and their own play “I, Worker”. Their work has spread internationally to Paris, New York, Toronto and Taipei.

For this interview, we would like to invite their collaboration partner Yi-Wei Keng, director of Taipei Arts Festival, to share his insights on the intersection of robotics and the arts.

by   -   November 29, 2016


Ethernet is the most pervasive communication standard in the world. However, it is often dismissed for robotics applications because of its presumed non-deterministic behavior. In this article, we show that in practice Ethernet can be extremely deterministic and provide a flexible and reliable solution for robot communication.

by   -   November 29, 2016
Roboethics panel in Amsterdam.

What ethical issues do we face in providing robot care for the elderly? Is there better acceptance with the public? What should we be mindful of when designing human-robot interactions?

At the #ERW2016 central event, held in Amsterdam 18-22 November, these questions (and more) were discussed and debated by expert panellists hailing from research, industry, academia, and government, as well as by insightful members of the community. All were welcome to ‘Robots at Your Service’, a multi-track event featuring panel deliberations on robotics regulation and assistive living technologies, and aimed at attracting more youth, and especially girls, into science, technology, engineering, arts and maths (STEAM). The event hosted workshops and featured a 48-hour hackathon for designers, makers, coders, engineers, and anyone else who believed healthy ageing should be a societal challenge.


India’s Rustom II medium-altitude long-endurance drone during a test flight in early November. Credit: DRDO

November 21, 2016 – November 27, 2016


A U.S. drone strike in Syria killed Abu Afghan al-Masri, a senior leader of al-Qaeda. Pentagon spokesperson Peter Cook confirmed that the strike took place near the town of Sarmada, in Aleppo province, on November 18. (Voice of America)

The National Transportation Safety Board is investigating an accident involving Facebook’s Aquila prototype drone during a test flight in June. According to an NTSB spokesperson, the drone experienced “structural failure” during the test flight. The Aquila is a high-altitude long-endurance drone that Facebook plans to use to beam Internet to remote areas. (Wall Street Journal)

The U.K. Civil Aviation Authority has revised the wording of its flight rules for drones. The move is part of the CAA’s push to increase awareness around responsible drone use. There have been several reported close encounters between drones and manned aircraft in U.K. airspace in recent months. (BBC)

Commentary, Analysis, and Art

At the Washington Post, Christian Davenport looks at the Pentagon’s efforts to develop unmanned undersea vehicles. For more on underwater drones, click here.

At Reuters, Ulf Laessing writes that ISIL is using drones to scout Iraqi Army locations and launch attacks in Mosul.

At Just Security, Philip Bobbitt reviews The Drone Memos: Targeted Killing, Secrecy, and the Law by Jameel Jaffer.

At Recode, April Glaser looks at how recent drone manufacturers DJI and GoPro did not have their new drones ready for sale in time for Black Friday.

At the New York Times, Adam Goldman and Eric Schmitt examine the U.S. strategy of targeting ISIL social media experts and recruiters with drones.

Also at the New York Times, Heidi Hutner offers a progress report on drone delivery tests in Madagascar.

At LobeLog, Eli Clifton looks at the connections between retired Lt. Gen. Michael Flynn and a drone company that has received border security contracts.

At Offiziere, Darien Cavanaugh examines DARPA’s push to develop a system to counter small, cheap consumer drones.

In a speech to the Airport Operators Association, Chris Grayling, the U.K. transport secretary, said that he had “less enthusiasm for a completely liberal market” for drones. (Daily Mail)

A report by PricewaterhouseCoopers concludes that the agricultural drone industry could be worth $32.4 billion. (The Motley Fool)

American photographer Johnny Miller used a drone to capture aerial images of inequality in Nairobi, Kenya. (Quartz)

Meanwhile, photographer Michael B. Rasmussen has been using a drone to take aerial images of the Danish countryside in fall. (Wired)

Know Your Drone

Aerospace firm Boeing has been awarded a patent for an automatic recharging station for small military drones. (Popular Science)

The FAA and the Nevada Institute for Autonomous Systems are testing counter-drone systems technology at the Nevada UAS test site. (Press Release)

Australia’s Defence Science and Technology program is developing autonomous unmanned air and ground vehicles as part of its Future Soldier initiative. (Herald Sun)

A team at the University of Alaska Fairbanks is testing a high-altitude unmanned glider as part of a program to develop technologies to track vehicles returning from space. (Your Alaska Link)

Telecommunications firm Nokia and the U.A.E. General Civil Aviation Authority are partnering to develop a drone air traffic management system. (Press Release)

Robotics firm Roboteam has unveiled an updated version of its Probot unmanned ground vehicle. (IHS Jane’s 360)

U.K. insurance firm Direct Line has proposed a system of drone streetlights that can follow pedestrians and light their path. (The Verge)

South Korea’s Agency for Defense Development is developing an electromagnetic pulse generator to use against North Korean drones. (Yonhap)

Indian officials have announced that the military’s developmental Rustom-2 drone, which has been renamed Tapas-BH 201, won’t have strike capabilities. (Times of India)

Singapore-based industrial company H3 Dynamics has unveiled the HYWINGS, a long-endurance commercial drone. (Deccan Chronicle)

Defense firms Schiebel and Israel Aerospace Industries are testing a new sensor payload that listens to communications from adversary targets. (FlightGlobal)

The U.S. Navy is planning to solicit proposals for the development of its Extra Large Unmanned Undersea Vehicle. (FBO)

A group of Ukrainian engineers has developed a multi-role long-endurance quadcopter drone. (Ukraine Today)

Japanese e-commerce company Rakuten Inc. conducted a demonstration flight of its developmental drone delivery system. (Enterprise Innovation)

Drones at Work

Email chains released by the U.K. Civil Aviation Authority show that Amazon began testing delivery drones at a secret site at least a year earlier than previously thought. The chain refers to tests conducted as early as the summer of 2015, but it was only publicly revealed that the program had begun in summer 2016. (Business Insider)

The Swiss Society for Rescue Dogs is teaming up with the Swiss Federation of Civil Drones to use unmanned aircraft during search operations. (The Local)

Mapping drones are being used in New Zealand to assist in the planning for a major highway project. (Stuff)

The Fallon Police Department in Nevada used a drone in a mock search exercise. (Nevada Appeal)

The UNHCR is exploring the use of drones to map refugee populations in Africa. (Relief Web)

The Tulare County Sheriff’s Department in California has initiated a drone program. (Fresno Bee)

Officials in Lewes, Delaware have decided against an ordinance that would regulate drones in the city. (Cape Gazette)

A drone was used to deliver mobile phones and other items to an inmate at Nyborg Prison in Denmark. (Reuters)

Finland’s Defence Forces have spotted various drones flying over military facilities and exercises. (YLE)

A drone was used in Co Clare, Ireland to help search for a missing woman. (Irish Examiner)

General Atomics Aeronautical Systems is dedicating a company-owned Avenger jet-powered military drone for humanitarian relief operations. (Aviation Week)

An Indiana conservation officer has been granted permission to operate a drone for search and rescue missions. (Kokomo Tribune)

Drone maker Autel Robotics published a video showing how a drone can be used to help prepare a Thanksgiving meal. (Gizmodo)

Industry Intel

The U.S. Air Force awarded General Atomics Aeronautical Systems a $39.8 million contract modification to extend the range on the MQ-9 Reaper. (Contract Announcement)

Kratos Defense & Security Solutions announced that it had been awarded a $17.8 million contract for BQM-167i target drones from an unidentified international customer. (Shephard Media)

The U.S. Navy awarded Northrop Grumman a $10.4 million contract to increase production of the MQ-8C Fire Scout. (FBO)

The Department of the Interior awarded 3D Robotics a $5,081 contract for unmanned aircraft systems. (Contract Announcement)

The European Maritime Safety Agency awarded Martek Marine a $10.6 million contract for drones that will monitor marine pollution levels. (BBC)

GoPro is offering free Hero 5 sport cameras and full refunds to customers who bought the Karma drone before the recall. (Investopedia)

For updates, news, and commentary, follow us on Twitter. The Weekly Drone Roundup is a newsletter from the Center for the Study of the Drone. It covers news, commentary, analysis and technology from the drone world. You can subscribe to the Roundup here.

by   -   November 28, 2016
Credit: Carl Vondrick, MIT CSAIL

Living in a dynamic physical world, it’s easy to forget how effortlessly we understand our surroundings. With minimal thought, we can figure out how scenes change and objects interact.

But what’s second nature for us is still a huge problem for machines. With the limitless number of ways that objects can move, teaching computers to predict future actions can be difficult.

Recently, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have gotten a step closer, developing a deep-learning algorithm that, given still images from a scene, can create brief videos that simulate the future of that scene.

Trained on two million unlabeled videos that include a year’s worth of footage, the algorithm generated videos that human subjects deemed to be realistic 20 percent more often than a baseline model.

To be clear, at this point the videos are still relatively low-resolution and only 1-1.5 seconds in length. But the team is hopeful that future versions could be used for everything from improved security tactics to safer self-driving cars.

According to CSAIL PhD student and first author Carl Vondrick, the algorithm can also help machines recognize people’s activities without expensive human annotations.

“These videos show us what computers think can happen in a scene,” says Vondrick. “If you can predict the future, you must have understood something about the present.”

Vondrick wrote the paper with MIT professor Antonio Torralba and Hamed Pirsiavash, a former CSAIL postdoctoral associate who is now a professor at the University of Maryland, Baltimore County. The work will be presented at next week’s Neural Information Processing Systems (NIPS) conference in Barcelona.

How it works
Multiple researchers have tackled similar topics in computer vision, including MIT professor Bill Freeman, whose new work on “visual dynamics” also creates future frames in a scene. But where his model focuses on extrapolating videos into the future, Torralba’s model can also generate completely new videos that haven’t been seen before.

Credit: Carl Vondrick, MIT CSAIL

Previous systems build up scenes frame by frame, which creates a large margin for error. In contrast, this work focuses on processing the entire scene at once, with the algorithm generating as many as 32 frames from scratch per second.

“Building up a scene frame-by-frame is like a big game of ‘Telephone,’ which means that the message falls apart by the time you go around the whole room,” says Vondrick. “By instead trying to predict all frames simultaneously, it’s as if I’m talking to everyone in the room at once.”

Of course, there’s a trade-off to generating all frames simultaneously: while it becomes more accurate, the computer model also becomes more complex for longer videos.

To create multiple frames, researchers taught the model to generate the foreground separate from the background, and to then place the objects in the scene to let the model learn which objects move and which objects don’t.
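One common way to combine a separately generated foreground and background is a per-pixel mask blend. The sketch below assumes that approach for illustration only; the article does not say how the researchers' model places objects in the scene. The function name `composite`, the nested-list frame format, and the sample values are hypothetical.

```python
def composite(foreground, background, mask):
    """Blend two single-channel frames pixel by pixel.

    A mask value of 1.0 takes the foreground pixel (a moving object);
    a mask value of 0.0 keeps the background pixel (the static scene).
    All three inputs are lists of rows with identical shapes.
    """
    return [
        [m * f + (1.0 - m) * b for f, b, m in zip(f_row, b_row, m_row)]
        for f_row, b_row, m_row in zip(foreground, background, mask)
    ]

# Hypothetical 1x3 frame: the mask selects the foreground fully, not at all, then partly.
frame = composite([[1.0, 1.0, 1.0]], [[0.0, 0.5, 0.0]], [[1.0, 0.0, 0.25]])
print(frame)
```

Because the background is generated once and the mask decides where motion appears, a blend of this kind gives a model a direct way to express which parts of the scene move and which stay still.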

The team used a deep-learning method called “adversarial learning” that involves training two competing neural networks. One network generates video, and the other discriminates between the real and generated videos. Over time, the generator learns to fool the discriminator.
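The competition between the two networks can be made concrete with the loss terms each one tries to minimize. This is a minimal sketch assuming the standard binary cross-entropy formulation of adversarial training; the article does not state which loss the team used. The function names and the sample probabilities are hypothetical, and the inputs stand for the discriminator's estimated probability that a clip is real.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Cross-entropy loss: real clips should score near 1, generated clips near 0."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating form: the generator wants its clips to score near 1."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs on real clips, and on generated clips
# before and after the generator has learned to fool it.
d_real = [0.9, 0.8]
early_fakes = [0.1, 0.2]   # discriminator easily spots the generated clips
late_fakes = [0.9, 0.8]    # discriminator is fooled

print(generator_loss(early_fakes), generator_loss(late_fakes))
print(discriminator_loss(d_real, early_fakes), discriminator_loss(d_real, late_fakes))
```

As the generated clips become more convincing, the generator's loss falls while the discriminator's loss rises, which is the tug-of-war the paragraph above describes.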

From that, the model can create videos resembling scenes from beaches, train stations, hospitals, and golf courses.  For example, the beach model produced beaches with crashing waves, and the golf model had people walking on grass.

Testing the scene
The team compared the videos against a baseline of generated videos and asked subjects which they thought were more realistic. From over 13,000 opinions of 150 users, subjects chose the generative model videos 20 percent more often than the baseline.

To be clear, the model still lacks some fairly simple common-sense principles. For example, it often doesn’t understand that objects are still there when they move, like when a train passes through a scene. The model also tends to make humans and objects look much larger in size than reality.

As mentioned before, another limitation is that the generated videos are just one and a half seconds long, which the team hopes to be able to increase in future work. The challenge is that this requires tracking longer dependencies to ensure that the scene still makes sense over longer time periods. One way to do this would be to add human supervision.

“It’s difficult to aggregate accurate information across long time periods in videos,” says Vondrick. “If the video has both cooking and eating activities, you have to be able to link those two together to make sense of the scene.”

These types of models aren’t limited to predicting the future. Generative videos can be used for adding animation to still images, like the animated newspaper from the Harry Potter books. They could also help detect anomalies in security footage and compress data for storing and sending longer videos.

“In the future, this will let us scale up vision systems to recognize objects and scenes without any supervision, simply by training them on video,” says Vondrick.

This work was supported by the National Science Foundation, the START program at UMBC, and a Google PhD fellowship.

Read the research paper.

by   -   November 28, 2016
A plant-moving robot from Billerica-based Harvest Automation. Source: harvestai/YouTube

To meet rising food demands from a growing global population, over 250 million acres of arable land will be needed – about 20% more land than all of Brazil. Alternatively, agricultural production will need to be more productive and more sustainable using our present acreage. Meeting future needs requires investment in alternative practices such as urban and vertical farming as well as existing indoor and covered methods.
