SBIR-STTR Award
Model Zero - Reinforcement Learning for Policy Transfer into High-Fidelity Environments
Award last edited on: 11/6/2023
Sponsored Program: SBIR
Awarding Agency: DOD : AF
Total Award Amount: $999,822
Award Phase: 2
Solicitation Topic Code: AF212-D001
Principal Investigator: William Li
Company Information
Heron Systems Inc
22685 Three Notch Road Unit B
California, MD 20619
(301) 866-0330
bd@heronsystems.com
www.heronsystems.com
Location: Single
Congr. District: 05
County: St. Mary's
Phase I
Contract Number: FA8750-22-C-0504
Start Date: 1/10/2022
Completed: 7/10/2023
Phase I year: 2022
Phase I Amount: $1
Direct to Phase II
Phase II
Contract Number: FA8750-22-C-0504
Start Date: 1/10/2022
Completed: 7/10/2023
Phase II year: 2022
Phase II Amount: $999,821
Recent demonstrations of super-human performance using reinforcement learning have shown the power of Artificial Intelligence (AI) for solving high-dimensional, complex problems through long-term decision making. However, most reinforcement learning approaches require hundreds of millions or even billions of training samples to achieve high-performing policies. In this paper we propose a novel model-based reinforcement learning algorithm called Model Zero that constructs a world model learned from a lower-fidelity simulation and transfers that world model into a high-fidelity simulation. Model Zero leverages the knowledge it acquired in the low-fidelity simulation and continues to learn a more accurate world model after being transferred into the high-fidelity simulation. A key advantage of our model-based reinforcement learning approach is that the methods are general and can be applied to various deterministic, high-fidelity, physics-based environments where training samples are computationally expensive to obtain.
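The transfer idea in the abstract can be illustrated with a toy sketch. This is not the Model Zero algorithm, whose details the abstract does not give. It assumes a linear world model, two hypothetical deterministic linear simulators (`low_fi` and `high_fi`) that differ by a small perturbation, and plain gradient descent; every name and parameter below is an illustrative assumption. It only shows why a world model pretrained on cheap low-fidelity samples may need far fewer expensive high-fidelity samples than one trained from scratch.

```python
import numpy as np

def make_env(A, B):
    """Hypothetical deterministic simulator: next_state = A @ s + B @ a."""
    def step(s, a):
        return A @ s + B @ a
    return step

def collect(step, n, rng, state_dim, action_dim):
    """Sample n random (state, action) pairs and record the next states."""
    S = rng.normal(size=(n, state_dim))
    U = rng.normal(size=(n, action_dim))
    X = np.hstack([S, U])                       # model input: [state, action]
    Y = np.stack([step(s, a) for s, a in zip(S, U)])
    return X, Y

def fit_world_model(X, Y, W_init, lr=0.05, steps=500):
    """Gradient descent on mean squared error for a linear model X @ W."""
    W = W_init.copy()
    for _ in range(steps):
        W -= lr * X.T @ (X @ W - Y) / len(X)
    return W

def mse(W, X, Y):
    return float(np.mean((X @ W - Y) ** 2))

def run_demo(seed=0):
    rng = np.random.default_rng(seed)
    state_dim, action_dim = 4, 2

    # Assumption: the high-fidelity dynamics are a small perturbation
    # of the low-fidelity dynamics.
    A_lo = 0.9 * np.eye(state_dim)
    B = rng.normal(scale=0.1, size=(state_dim, action_dim))
    A_hi = A_lo + rng.normal(scale=0.02, size=(state_dim, state_dim))
    low_fi, high_fi = make_env(A_lo, B), make_env(A_hi, B)

    W_zero = np.zeros((state_dim + action_dim, state_dim))

    # 1) Pretrain the world model on plentiful, cheap low-fidelity samples.
    X_lo, Y_lo = collect(low_fi, 2000, rng, state_dim, action_dim)
    W_lo = fit_world_model(X_lo, Y_lo, W_zero)

    # 2) Only a handful of expensive high-fidelity samples are available.
    X_hi, Y_hi = collect(high_fi, 3, rng, state_dim, action_dim)
    W_transfer = fit_world_model(X_hi, Y_hi, W_lo)    # continue from pretrained
    W_scratch = fit_world_model(X_hi, Y_hi, W_zero)   # no transfer

    # 3) Compare both models on held-out high-fidelity transitions.
    X_test, Y_test = collect(high_fi, 500, rng, state_dim, action_dim)
    return {
        "transfer_mse": mse(W_transfer, X_test, Y_test),
        "scratch_mse": mse(W_scratch, X_test, Y_test),
    }

if __name__ == "__main__":
    print(run_demo())
```

In this toy setting the three high-fidelity samples cannot pin down the dynamics on their own, so the from-scratch model stays far from the true dynamics, while the transferred model starts close and only has to correct the small low-to-high-fidelity gap. A real system would likely use a nonlinear learned model and a planner or policy on top of it, which this sketch omits.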