Beyond Ordinary HyperQ®

A Q-learner Without State Limits

Q-learners are typically implemented as a two-dimensional matrix of rewards. Given a state, you have a vector of actions, each with a known expected reward. The course of action for that state is chosen using a nonlinear evaluation such as Max or Min, depending on what your reward function does. The traditional limit of the Q-learner is constrained state management: because the table is a 2-D array of rewards, the state index is a single dimension, and every possible state must be mapped onto one of those indices. That mapping can quickly run out of memory and become computationally intractable.
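
For readers who want the mechanics spelled out, here is a minimal sketch of that classic tabular formulation in C#. It is illustrative only (it is not part of the HyperQ library), and it shows the constraint in question: the entire state must be reduced to a single integer row index before the table can be consulted.

    class TabularQ
    {
        readonly double[,] q;   // rows = states, columns = actions
        readonly double alpha;  // learning rate
        readonly double gamma;  // discount factor

        public TabularQ(int stateCount, int actionCount,
                        double alpha = 0.1, double gamma = 0.99)
        {
            q = new double[stateCount, actionCount];
            this.alpha = alpha;
            this.gamma = gamma;
        }

        // The "Max" selector from the text: pick the action with the highest
        // estimated reward. A cost-based reward function would use Min instead.
        public int BestAction(int state)
        {
            int best = 0;
            for (int a = 1; a < q.GetLength(1); a++)
                if (q[state, a] > q[state, best]) best = a;
            return best;
        }

        // Standard one-step update: Q(s,a) += alpha * (r + gamma * max Q(s',*) - Q(s,a)).
        public void Update(int state, int action, double reward, int nextState)
        {
            double bestNext = q[nextState, BestAction(nextState)];
            q[state, action] += alpha * (reward + gamma * bestNext - q[state, action]);
        }
    }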

With HyperQ you no longer have that single dimension of index lookup. The Q engine does the mapping for you, so you can hand it any number of elements for your state. In applied machine learning, agents typically have telemetry vectors that describe their state. Those vectors can be used directly as the state for the HyperQ learner, without reducing their fidelity through a hash or compression algorithm just to perform the state mapping.
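
To make that contrast concrete, the sketch below shows the kind of lossy flattening a flat Q table would force on a telemetry vector. The helper name and the assumption that each element is normalized into [0, 1) are ours for illustration; with HyperQ this step is simply not needed.

    using System;

    static class StateFlattening
    {
        // Crush a telemetry vector into one 1-D table index by quantizing each
        // element into a coarse bucket. This is the fidelity-reducing workaround
        // a flat Q table requires. Assumes each element is normalized to [0, 1).
        public static int Flatten(double[] telemetry, int bucketsPerElement)
        {
            int index = 0;
            foreach (double t in telemetry)
            {
                double clamped = Math.Max(0.0, Math.Min(t, 0.999));
                int bucket = (int)(clamped * bucketsPerElement);
                index = index * bucketsPerElement + bucket; // fold into a single index
            }
            // The index space grows as bucketsPerElement ^ telemetry.Length,
            // which is why a flat table quickly exhausts memory.
            return index;
        }
    }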

This algorithm for Q-learning does not require a neural network or expensive GPU hardware. Your CPU and a generous amount of memory are the only tools you need.

We've applied this learner to a variety of problems, including the original Atari LEM game and the GYM.Net Lunar Lander game. Raw telemetry was used to solve the problems (not the 2-D raster), and the results have been repeatedly successful. Read our white paper on the solution and its algorithm to learn more about the tool.
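
As a concrete picture of what "raw telemetry" means here, the snippet below lays out a sample observation vector for the classic Gym Lunar Lander environment. The field layout follows the standard environment definition; the values are an illustrative snapshot, and the snippet is not our integration code.

    // Sample Lunar Lander observation: this raw vector, not a rendered
    // 2-D raster, is the state the learner consumes.
    double[] landerTelemetry =
    {
         0.00,  // horizontal position relative to the landing pad
         1.40,  // vertical position (altitude)
         0.00,  // horizontal velocity
        -0.60,  // vertical velocity (descending)
         0.05,  // hull angle, in radians
         0.00,  // angular velocity
         0.00,  // left leg contact flag (0 or 1)
         0.00   // right leg contact flag (0 or 1)
    };

    // Four discrete actions: do nothing, fire left engine, fire main engine,
    // fire right engine.
    System.Console.WriteLine($"State has {landerTelemetry.Length} telemetry elements.");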

Q LEARNING WITHOUT LIMITS

This video shows the solution to the classic GYM Lunar Lander environment in action.

LEM, Lunar Lander, and War3 are marks owned by Atari and various other authors. No claim of ownership or affiliation is made in referencing these games.

Download The Beyond Ordinary HyperQ® Library

The library automatically disables itself after 14 days unless you provide a license key to activate it. Activation occurs over the Internet.

Highly Capable and Extensible

  • Scalable to a variety of state representations.
  • C# and .NET compatible and capable of running without any GPU.
  • Supports Dyna replay.
  • Has positive/negative feedback memory selection.
  • Supports epsilon and alpha decay with customizable decay functions (see the decay sketch after this list).
  • Extensible action selector with default min/max selectors.
  • Memory-intensive library with modest CPU requirements.
  • Requires Microsoft's .NET platform.
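
As a general illustration of what a customizable decay function can look like, here is a simple exponential epsilon schedule. The delegate shape and constants are assumptions for the example, not the library's actual decay hooks; an alpha schedule can follow the same pattern.

    using System;

    // Exploration-rate (epsilon) schedule: multiplicative decay per episode,
    // with a floor so the learner never stops exploring entirely.
    Func<int, double> epsilonSchedule = episode =>
    {
        const double start = 1.0;
        const double floor = 0.05;
        const double rate  = 0.995;
        return Math.Max(floor, start * Math.Pow(rate, episode));
    };

    Console.WriteLine(epsilonSchedule(0));     // 1.0
    Console.WriteLine(epsilonSchedule(200));   // roughly 0.37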

Beyond Ordinary HyperQ Pricing

This is a subscription product, paid yearly on a per-product-integration basis. The product cannot be embedded in a product that is given away for free. The library requires Internet access to verify licensing; if it cannot connect to our licensing servers, it will disable itself.

This is an Artificial Intelligence product regulated by the US Department of Commerce. This technology, and any software or service derived from it, is export controlled and cannot be distributed to regions that are on the export exclusion list.

  • .NET Framework 4.8 or later, or .NET 6/7/8
  • Licensed per-product integration (not per-developer)
  • $120,000 per year for a web-delivered service, OR
  • $60,000 per year for an embedded external product, OR
  • $10,000 per year for an internal-use-only product.
  • Want to add Dyna for your training? $5,000 per year additional cost.
  • Want to talk to a human? 50% of your licensed cost, per year, for that level of support.
  • Want the source code? $2,000,000 with limited usage rights.