From WWII Code Breakers to Smart Machines

by Darin Ellingson, Validation SME, Verista

Can your AI/ML systems pass the Turing Test?

Not long after cracking the German Enigma code during WWII, Alan Turing wrote a 1950 paper called "Computing Machinery and Intelligence" in which he asked a very simple question: Can machines think? He elaborated on the concept by posing a test, which he called the "imitation game" and which later became known as the Turing Test.

Can a machine exhibit intelligent behavior that is equivalent to, and indistinguishable from, that of a human? Technology has advanced dramatically since the 1950s; however, no machine yet exists that can pass the Turing Test in its fullest sense. Advances in Artificial Intelligence (AI) and Machine Learning (ML) have brought us closer to realizing intelligent machines. We experience these technologies daily in voice assistants like Siri and Alexa, self-driving cars, chatbots, and recommendation engines such as those behind Netflix and Prime Video, yet we rarely think about what makes them possible.

What is Artificial Intelligence?

Essentially, Artificial Intelligence is a segment of computer science devoted to creating smart machines that can accomplish tasks normally requiring human intelligence and interaction.

The four types of Artificial Intelligence

Reactive Machines utilize the most basic principles of AI, perceiving and reacting to their environment based on a set of rules. They cannot store "memories," so past experience cannot be leveraged for decision making. These machines are typically designed to accomplish a limited set of tasks and to perform those tasks repeatedly and reliably. IBM's Deep Blue is a prime example of a Reactive Machine. It was designed to play chess according to its rule set: it evaluates the pieces currently on the board and reacts with the most logical next move, but it stores no memory of past games, so it cannot study an opponent's previous moves to learn their strategy. Deep Blue played matches against Chess Grandmaster Garry Kasparov in the mid-1990s; Kasparov won the first match in 1996, and Deep Blue won the 1997 rematch, marking the first defeat of a reigning world champion by a machine under tournament conditions.
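
To make the idea concrete, a Reactive Machine can be sketched as a pure mapping from the current perception to an action, with nothing remembered between decisions. This is a minimal, hypothetical illustration (not IBM's actual engine); the rule names are invented for the example.

```python
# Minimal sketch of a Reactive Machine: a fixed rule set maps the
# current perception directly to an action. Nothing is stored between
# calls, so past moves cannot influence the next decision.
RULES = {
    "opponent_threatens_queen": "move_queen_to_safety",  # hypothetical rules
    "center_open": "occupy_center",
}

def reactive_agent(perception: str) -> str:
    """Return an action based only on the current state."""
    return RULES.get(perception, "make_default_move")

print(reactive_agent("center_open"))       # -> occupy_center
print(reactive_agent("unknown_position"))  # -> make_default_move
```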

Limited Memory stores data and predictions to use in decision making; in essence, it uses past experience to predict future occurrences. Limited Memory is used heavily in ML, where teams of data scientists build models and continuously teach them how to process and utilize new data, or use environments that train models automatically. The critical steps of Limited Memory ML are: creating training data, creating the model, having the model make predictions, receiving feedback on those predictions, storing the feedback as new data, and iterating all previous steps so the model continues to learn and evolve.
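
A minimal sketch of that loop, using hypothetical data and scikit-learn's incremental SGDRegressor (one of several libraries that support this pattern), might look like:

```python
# Minimal sketch of the Limited Memory loop: create training data,
# create the model, predict, receive feedback, store it as data, and
# iterate. The data and model choice here are hypothetical.
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)
true_weights = np.array([2.0, -1.0, 0.5])     # unknown in a real setting

X_train = rng.random((100, 3))                # 1. create training data
y_train = X_train @ true_weights

model = SGDRegressor(random_state=0)          # 2. create the model
model.partial_fit(X_train, y_train)

for _ in range(10):                           # 6. iterate the loop
    X_new = rng.random((10, 3))
    y_pred = model.predict(X_new)             # 3. model makes predictions
    y_true = X_new @ true_weights             # 4. feedback is received
    model.partial_fit(X_new, y_true)          # 5. feedback stored as data
                                              #    and used to keep learning
print("final fit score:", model.score(X_train, y_train))
```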

Theory of Mind is based on the psychological understanding that living beings possess thoughts and emotions that affect behavior. This type of AI would understand how other machines, humans, and animals feel and make decisions, then apply that knowledge to its own decision making. Technically speaking, this area of AI is still mostly fiction.

Self-Awareness is AI that has reached the level of human consciousness. It understands its own existence, the existence of others, and their associated emotional states. Essentially, the machine has become a thinking, conscious being that can interpret interactions like a human and respond accordingly.

What is Machine Learning?

ML is a subsegment of AI that uses data and algorithms to "imitate" human learning. Given different data sets and algorithms, the machine gradually improves as it learns. As mentioned above, Limited Memory AI utilizes ML through three major model types:

– Reinforcement Learning – uses trial and error to learn how to make better decisions (a minimal sketch follows this list)

– Long Short-Term Memory – learns to make predictions from historical, sequential data, weighting recent observations more heavily than older ones

– Evolutionary Generative Adversarial Networks – constantly evolve by exploring different paths based on previous decisions and experiences. These models use simulations, statistics, and random chance to make predictions throughout their evolution.
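
As promised above, here is a minimal, hypothetical sketch of reinforcement-style trial-and-error learning: a two-armed bandit where the agent gradually learns which of two actions pays off more often. The payout rates and update rule are illustrative, not from the article.

```python
# Minimal sketch of trial-and-error learning on a hypothetical
# two-armed bandit: the agent tries actions, observes rewards, and
# gradually prefers the action with the higher expected payout.
import random

random.seed(0)
payout = {"A": 0.3, "B": 0.7}   # true win rates, unknown to the agent
value = {"A": 0.0, "B": 0.0}    # the agent's learned estimates
count = {"A": 0, "B": 0}

for _ in range(1000):
    # Explore 10% of the time; otherwise exploit the best estimate.
    if random.random() < 0.1:
        action = random.choice(["A", "B"])
    else:
        action = max(value, key=value.get)
    reward = 1.0 if random.random() < payout[action] else 0.0
    count[action] += 1
    # Incremental average: nudge the estimate toward the new reward.
    value[action] += (reward - value[action]) / count[action]

print(value)  # the estimate for "B" should approach 0.7
```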

What is Data Science?

Oracle defines Data Science as follows: "Data science combines multiple fields, including statistics, scientific methods, AI, and data analysis, to extract value from data. Those who practice data science are called data scientists, and they combine a range of skills to analyze data collected from the web, smartphones, customers, sensors, and other sources to derive actionable insights."

Applied through AI/ML, Data Science has turned business data into competitive advantage in areas such as forecasting (finances, resources, sales, etc.), supply chain optimization, fraud detection, and predictive maintenance.

Why are Data Warehouses and Data Lakes critical to the success of AI/ML?

Traditional IT systems store data in dedicated databases in a very structured manner. Databases are suitable for analyzing small data sets, running reports and queries, automating processes, and handling data entry for specific systems. Data Warehouses are large storage locations used to aggregate data from multiple sources, also in a very structured manner. Data Lakes are similar to Data Warehouses; however, the data they hold is usually unstructured.

Lee Easton of AeroVision.io uses a tool analogy to describe the differences. Your data are your tools. The database is your toolbox, providing structured storage for those tools. The data warehouse is your tool shed, where multiple databases (toolboxes) store different types of data (tools). So what is a Data Lake? Remember that its data is unstructured: imagine discarding the toolboxes and dumping all your tools onto the floor of the shed. That is a Data Lake: a large collection of disparate data stored together, with no rules about what types of data can be stored.
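
In code, the difference often shows up as schema-first storage versus "schema on read." The sketch below is hypothetical (the table, values, and data_lake directory are invented) and uses Python's standard library to contrast the two:

```python
# Hypothetical contrast: a warehouse enforces structure up front
# ("schema on write"), while a lake is a pile of mixed files whose
# structure is applied only when the data is read ("schema on read").
import sqlite3
from pathlib import Path

# Data Warehouse style: a fixed schema, queried with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.execute("INSERT INTO sales VALUES ('EU', 1200.0)")
for row in conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region"):
    print(row)

# Data Lake style: disparate files dumped together (emails, images,
# logs, CSVs) with no rules on what may be stored.
lake = Path("data_lake")                 # hypothetical directory
for item in lake.glob("**/*"):
    print(item.name, item.suffix or "(no extension)")
```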

Data Warehouses and Data Lakes are components of what is called Big Data. AI and ML rely on data flowing through models to produce insights. ML in particular relies on varied data sets to learn from, adjusting its models to conditions, rules, and algorithmic processing. AI in general is useless without data, and the more data available to process, the better the resulting AI. As AI advances using Big Data, less human interaction is required to maintain its proper functionality, and with less human intervention we come closer to maximizing the potential of AI and Big Data.

Digging deeper into the types of data businesses create and collect, it's estimated that 80-90% of all business data is unstructured, and that its volume is growing by as much as 65% per year. Prior to the advent of AI, unstructured data was mostly forgotten because no tools existed to leverage it. AI now allows data such as emails, text files, video and audio files, websites and blogs, and images to be used for analysis and trending in ways that weren't possible before. Since the majority of newly generated business data is unstructured, it is critical for businesses to manage and analyze this data to guide decision making (data-driven decisions) and gain competitive advantage.
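
As a small, hypothetical example of putting unstructured data to work, the sketch below turns free-form email text into a numeric matrix that ML models can analyze, using scikit-learn's bag-of-words vectorizer; the emails are invented for illustration.

```python
# Hypothetical sketch: turning unstructured text (e.g., support
# emails) into a numeric matrix that analysis and ML tools can use.
from sklearn.feature_extraction.text import CountVectorizer

emails = [
    "Shipment delayed, customer unhappy with delivery time",
    "Invoice question about last month's delivery charges",
]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(emails)        # documents -> word counts
print(vectorizer.get_feature_names_out())   # the learned vocabulary
print(X.toarray())                          # one row per email
```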

What’s required to get started with AI and ML?

Choosing the appropriate solution can be daunting given the wealth of tools available for AI and ML. Below is a simple framework to get you started.

· Select a business case – choose a business area where you want deeper insight. Avoid overly complex cases while you're getting started; your models will require updates and improvements as the solution develops, so it's best to start simple.

· Staff appropriately – if AI expertise isn't already present in your organization, your staff will require training in AI; alternatively, hiring Data Scientists trained in AI is a must.

· Select an AI solution – your Data Scientists will help the business select the most appropriate tools for the business case you are exploring. Keep it simple and expand as the solution improves.

· Improve the solution – with your simple solution in place, evaluate its performance and determine where changes and improvements are warranted. Share the solution with others in your organization for feedback, then make adjustments as necessary: adjust your algorithms, try other tools, change how data is cleansed and prepared for analysis. Do whatever makes sense to improve your solution.

· Select new business cases – now that you have AI efforts ongoing with your first business case, select new business cases to pursue. AI is not a checkbox; it's a continuously evolving process, and adding new areas to explore will only deepen the insights into the data your business is generating.

The next step for companies continuing their Pharma 4.0 enablement is to explore applications of AI technologies. The specific application we will explore in part 4 of this series is Predictive Analysis, along with industry use cases.

Reach out to Verista to see how we can help you with your next Advanced Analytics project!