March 2004
Notes
The diagrams that were supposed to appear in the last two months' newsletters didn't display on most people's systems. The newsletters are archived at an alternate site, www.ainewsletter.com, which has copies of the issues with the pictures.
This month's newsletter has some further applications of the fuzzy logic system developed in the February newsletter. Contact me to get the latest version of the fuzzy logic system and the full source for this month's examples.
Emergent Behavior
Emergent behavior refers here to nature-inspired computing algorithms, evolutionary and otherwise, that can be used to search for desired results. Here's some interesting work in the field.
Evolutionary Design and Lego(tm) Blocks
Pablo Funes of Brandeis University has done a lot of work in the area of evolutionary design. As part of his research he has built experimental systems that evolve structures of Lego(tm) blocks that meet various design goals.
The Lego(tm) blocks are an excellent experimental domain because they are modular units that are constrained almost entirely by the number of join points between two blocks. The model of the blocks then is a model of the joints, and how much stress a joint between two blocks can stand.
The evolutionary algorithm starts with a single brick and a design goal. The design goal might specify a height and a weight. In other words, the expected result would be a table of sufficient size and support to hold the weight. The goal might also specify a position, so the result might be a cantilevered bridge.
The algorithm starts by adding a random brick to the starting brick and creating a number of test structures. The best are kept, the worst thrown out, and then each survivor is further evolved. Standard genetic algorithm techniques are used: good structures are "mated", with parts of each going into the offspring, and mutations are introduced periodically to open up new possibilities.
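The keep-the-best, mate, and mutate loop described above can be sketched in a few lines of Python. This is only a toy under loud assumptions: it is not Funes's system, it has no model of joint stress, and the brick sizes, the target span and every function name are made up for illustration. A candidate is simply a row of eight bricks, and the goal is a row that spans a target number of studs.

```python
import random

BRICK_LENGTHS = [1, 2, 3, 4, 6, 8]   # hypothetical brick sizes, in studs
TARGET_SPAN = 37                     # hypothetical design goal, in studs
GENOME_SIZE = 8                      # each candidate is a row of 8 bricks


def fitness(genome):
    """Higher is better; 0 means the row spans the target exactly."""
    return -abs(sum(genome) - TARGET_SPAN)


def crossover(a, b, rng):
    """Mate two parents: the child takes a's head and b's tail."""
    cut = rng.randrange(1, GENOME_SIZE)
    return a[:cut] + b[cut:]


def mutate(genome, rng, rate=0.1):
    """Occasionally swap a brick for a random one to open new possibilities."""
    return [rng.choice(BRICK_LENGTHS) if rng.random() < rate else g
            for g in genome]


def evolve(generations=200, population_size=30, seed=0):
    rng = random.Random(seed)
    population = [[rng.choice(BRICK_LENGTHS) for _ in range(GENOME_SIZE)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Keep the best half, throw out the rest.
        population.sort(key=fitness, reverse=True)
        survivors = population[:population_size // 2]
        # Refill the population with mutated offspring of the survivors.
        children = [mutate(crossover(rng.choice(survivors),
                                     rng.choice(survivors), rng), rng)
                    for _ in range(population_size - len(survivors))]
        population = survivors + children
    return max(population, key=fitness)


best = evolve()
print(best, "spans", sum(best), "studs")
```

Because the survivors are carried over unchanged, the best candidate never gets worse from one generation to the next. A real structural designer would replace the fitness function with a physical simulation of the joints.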
What's fun about this work is that the researchers could then build the resulting structures with real Lego(tm) blocks to see if they worked. The thesis contains a number of pictures of these evolved structures (see links).
The Lego(tm) designer shows up in Brandeis' Dynamical & Evolutionary Machine Organization (DEMO) project and can be seen by following the fun&games and EvoCAD links on that Web site (see links). It is an interactive Java applet that lets you manipulate two-dimensional blocks by hand and/or specify design goals and let the evolutionary algorithm finish the design.
Ant Colony Optimization (ACO)
Ant Colony Optimization (ACO) is a search algorithm inspired by the intelligence exhibited by ant colonies in finding food and building roads. Each ant can sense, via scent, what the other ants have done and add its own behavior to a general pattern. For example, when searching for food, ants leave a scent trail. Other ants will follow one trail or another, but trails that lead to food tend to be traveled more often, leading to stronger scents. Further, ants that find food via a shorter path also leave scent trails, and because a shorter path can be traveled more often in the same time, it tends to gather scent faster, thus drawing ants from other trails. In this way the shortest path to food is eventually found and a majority of ants will use it.
ACO is a formalization of this idea, where individual agents search about for possible solutions to a problem, leaving traces of where they have been and how successful they were. The algorithms can be used for travel planning and scheduling applications, and have their pros and cons. The papers on Marco Dorigo's Web site (see links) provide a wealth of detailed information on the algorithms and their applications.
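To make the scent-trail idea concrete, here is a minimal Python sketch with just two paths to the food, one twice as long as the other. It is an illustration under assumed parameters, not one of Dorigo's published algorithms: the function name, the number of ants, the evaporation rate and the rule that an ant deposits scent in inverse proportion to path length are all choices made for this example.

```python
import random


def double_bridge(short=1.0, long=2.0, ants=50, rounds=100,
                  evaporation=0.1, seed=0):
    """Return the final share of scent that lies on the short path."""
    rng = random.Random(seed)
    lengths = {"short": short, "long": long}
    pheromone = {"short": 1.0, "long": 1.0}   # start with no preference
    for _ in range(rounds):
        deposits = {"short": 0.0, "long": 0.0}
        total = pheromone["short"] + pheromone["long"]
        for _ in range(ants):
            # Each ant picks a path with probability proportional to its scent.
            if rng.random() < pheromone["short"] / total:
                path = "short"
            else:
                path = "long"
            # Assumed rule: a shorter trip lays down more scent.
            deposits[path] += 1.0 / lengths[path]
        for path in pheromone:
            # Old scent fades, then the new scent is added.
            pheromone[path] = ((1.0 - evaporation) * pheromone[path]
                               + deposits[path])
    return pheromone["short"] / (pheromone["short"] + pheromone["long"])


share = double_bridge()
print(f"share of scent on the short path: {share:.2f}")
```

No single ant compares the two paths. The preference for the short one emerges from the feedback between scent and traffic, which is the point of the approach.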
Fuzzy Logic and Health Risk
Last month's issue described the use of fuzzy logic for controllers, but hinted at its use in risk assessment, in particular with an eye towards coming up with a way to model the serotonin syndrome caused by certain drug interactions.
Fuzzy logic has been used in financial systems to estimate risk. This makes sense when compared to other approaches such as certainty factors or Bayesian probability. The risk isn't really a probability or a degree of certainty, but rather, exactly a degree of risk, not unlike the degree of hotness or coldness in a shower based on how far the knob is turned.
Smoking, Exercise and Lung Cancer Risk
The same idea can be applied to health risk. I used my own medical knowledge, acquired by reading the newspapers, to come up with a simple model to illustrate this idea. It's known that the risk of getting lung cancer increases with the amount of smoking a person does, and decreases with the amount of exercise a person gets.
Using the fuzzy logic test environment we developed in the last newsletter, we can express these ideas as three fuzzy variables and a set of fuzzy rules relating them. The fuzzy sets for the variables risk(lung_cancer), smoking, and exercise follow the same pattern we used for the shower, which was to have three sets for each, representing low, medium and high values.
The fuzzy rules are:
If smoking is low or exercise is high, then risk(lung_cancer) is low.
If smoking is medium or exercise is medium, then risk(lung_cancer) is medium.
If smoking is high or exercise is low, then risk(lung_cancer) is high.
Here's how they are coded in the fuzzy logic test environment developed in the February newsletter:
% from 0 to 100
fuzzy_set(risk(lung_cancer), [
   variable(low) :: descending_line( 0, 50 ),
   variable(medium) :: triangle( 25, 50, 75 ),
   variable(high) :: ascending_line( 50, 100 )
   ]).

% packs per day
fuzzy_set(smoking, [
   variable(low) :: descending_line( 0.0, 1.0 ),
   variable(medium) :: triangle( 0.5, 1.0, 1.5 ),
   variable(high) :: ascending_line( 1.0, 2.0 )
   ]).

% hours per week
fuzzy_set(exercise, [
   variable(low) :: descending_line( 0.0, 7.0 ),
   variable(medium) :: triangle( 3.0, 7.0, 11.0 ),
   variable(high) :: ascending_line( 7.0, 14.0 )
   ]).

fuzzy_rules(risk(lung_cancer), [
   do(low) :: smoking fzis low fzor exercise fzis high,
   do(medium) :: smoking fzis medium fzor exercise fzis medium,
   do(high) :: smoking fzis high fzor exercise fzis low
   ]).
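For readers without the February tool set at hand, the following Python sketch shows one plausible reading of how the sets and rules above turn inputs into rule strengths. The shapes are assumptions on my part: descending_line(a, b) is taken to be 1 at or below a and 0 at or above b, ascending_line(a, b) the mirror image, and triangle(a, b, c) a peak of 1 at b. Reading fzor as the maximum of the two memberships is also an assumption, although it is the usual choice. The function rule_strengths is a name invented for this example.

```python
def descending_line(a, b):
    """Membership 1 at or below a, falling linearly to 0 at b (assumed shape)."""
    return lambda x: max(0.0, min(1.0, (b - x) / (b - a)))


def ascending_line(a, b):
    """Membership 0 at or below a, rising linearly to 1 at b (assumed shape)."""
    return lambda x: max(0.0, min(1.0, (x - a) / (b - a)))


def triangle(a, b, c):
    """Membership 0 outside (a, c), peaking at 1 when x equals b."""
    return lambda x: max(0.0, min((x - a) / (b - a), (c - x) / (c - b)))


SMOKING = {   # packs per day
    "low": descending_line(0.0, 1.0),
    "medium": triangle(0.5, 1.0, 1.5),
    "high": ascending_line(1.0, 2.0),
}
EXERCISE = {  # hours per week
    "low": descending_line(0.0, 7.0),
    "medium": triangle(3.0, 7.0, 11.0),
    "high": ascending_line(7.0, 14.0),
}


def rule_strengths(packs, hours):
    """Fire the three risk rules, reading fzor as max (an assumption)."""
    s = {name: f(packs) for name, f in SMOKING.items()}
    e = {name: f(hours) for name, f in EXERCISE.items()}
    return {
        "low": max(s["low"], e["high"]),
        "medium": max(s["medium"], e["medium"]),
        "high": max(s["high"], e["low"]),
    }


print(rule_strengths(3, 1))    # heavy smoker, little exercise
print(rule_strengths(0, 12))   # non-smoker, lots of exercise
print(rule_strengths(0, 0))    # non-smoker, no exercise
```

Under these assumptions the heavy smoker fires only the high rule, the active non-smoker fires only the low rule, and the inactive non-smoker fires low and high equally.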
The risk is represented as a number from 0 to 100, although because of the way the defuzzification works, it will never reach either of the end points. Here are three test runs: a high risk scenario, a low risk scenario, and one showing how someone who has one risk factor (no exercise) but not the other (doesn't smoke) winds up with a total risk exactly in the middle.
how many packs a day? 3
how many hours of exercise a week? 1
risk(lung_cancer) = 83.3333

how many packs a day? 0
how many hours of exercise a week? 12
risk(lung_cancer) = 16.6667

how many packs a day? 0
how many hours of exercise a week? 0
risk(lung_cancer) = 50
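The three results above are what a center-of-gravity defuzzification would give, which suggests, though the newsletter does not say so, that this is how the tool set works. The Python sketch below makes that assumption explicit: each output set is clipped at the strength of its rule, the clipped sets are combined by taking the larger value at each point, and the answer is the centroid of the combined shape over 0 to 100. The shape definitions and the name centroid are assumptions made for this example, and the integral is approximated numerically.

```python
def descending_line(a, b):
    """Membership 1 at or below a, falling linearly to 0 at b (assumed shape)."""
    return lambda x: max(0.0, min(1.0, (b - x) / (b - a)))


def ascending_line(a, b):
    """Membership 0 at or below a, rising linearly to 1 at b (assumed shape)."""
    return lambda x: max(0.0, min(1.0, (x - a) / (b - a)))


def triangle(a, b, c):
    """Membership 0 outside (a, c), peaking at 1 when x equals b."""
    return lambda x: max(0.0, min((x - a) / (b - a), (c - x) / (c - b)))


RISK = {   # from 0 to 100
    "low": descending_line(0, 50),
    "medium": triangle(25, 50, 75),
    "high": ascending_line(50, 100),
}


def centroid(strengths, lo=0.0, hi=100.0, steps=10000):
    """Center of gravity of the clipped output sets (midpoint rule).

    Assumes at least one rule has a strength above zero.
    """
    width = (hi - lo) / steps
    area = 0.0
    moment = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width
        # Clip each set at its rule strength, then take the pointwise max.
        mu = max(min(strengths.get(name, 0.0), f(x))
                 for name, f in RISK.items())
        area += mu * width
        moment += mu * x * width
    return moment / area


print(round(centroid({"high": 1.0}), 4))               # only the high rule fires
print(round(centroid({"low": 1.0}), 4))                # only the low rule fires
print(round(centroid({"low": 1.0, "high": 1.0}), 4))   # low and high fire equally
```

The values should come out close to 83.33, 16.67 and 50. It also shows why the end points are out of reach: the centroid of the full high set sits two thirds of the way along its ramp, not at 100, and the centroid of the full low set sits at a third of the way along, not at 0.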
Serotonin Syndrome
The December newsletter discussed a system that could be used to predict drug interactions based on first principles, and used as an example the serotonin syndrome. That first version of the system modelled the various factors and interactions of the system, but used pure boolean variables for both factors and the risk of getting the serotonin syndrome. A better model for the data, one that captured the inherent uncertainty of the system, was left as a future project.
Now, fuzzy logic has found its greatest use in control systems, where it is used to model smooth gradations of complex inputs and outputs of some physical system. Drug interaction is the modelling of a biological system, with complex inputs and outputs, and fuzzy logic might be an excellent way to implement such a system.
The fuzzy logic system of last month's newsletter was used to try out the idea on the serotonin syndrome, a result of certain drug/condition interactions.
The rules from the doctor were that:
The serotonin syndrome is a risk if extra cellular serotonin is high and either serotonin removal is impaired or serotonin response is enhanced.
Extra cellular serotonin is high when certain drugs are consumed, of which alcohol is one.
Serotonin removal is impaired by certain drugs, of which citalopram is one.
Serotonin response is enhanced by certain drugs, of which buspiron is one.
The December newsletter included more of the factors and a better ontological model of drugs and effects, but to test the use of fuzzy logic, just the three drugs mentioned above were used.
The modelling of this system using fuzzy logic involved a fuzzy set and rules for risk(serotonin syndrome), and three subordinate fuzzy sets and rules for the pertinent factors of extra cellular serotonin, serotonin removal and serotonin response. As with the smoking/exercise/lung_cancer example, three fuzzy sets, for low, medium and high, were used for each.
There is an immediate objection one might have to this approach, and that is that it's completely ad hoc. The degree of risk and the degree of various inputs have been associated in a very arbitrary manner. This is true, so I went to the doctor working on this system, who is one of the leading researchers in the area, and tried to get a more concrete understanding of the relationships.
His response, in a nutshell, was a very educated and detailed "we don't know". In other words, the interactions are so complex and the research is so sparse, that it really is impossible to quantify the system in any kind of detailed manner. The best that probably can be done is to have a system that expresses the knowledge that the risk of the serotonin syndrome increases when alcohol is consumed. And that's exactly what the fuzzy logic system does.
Here are two runs of the system, where the amount of a given drug is given on a 0-10 scale, where 0 is none and 10 is a lot. As expected, the risk is higher when the factors are higher.
alcohol? 8
citalopram? 6
buspiron? 6
risk(serotonin syndrome) = 76.9942

alcohol? 2
citalopram? 2
buspiron? 2
risk(serotonin syndrome) = 18.3433
Here are the sets and rules used with February's fuzzy logic tool set.
fuzzy_set(risk('serotonin syndrome'), [
   variable(low) :: descending_line( 0, 50 ),
   variable(medium) :: triangle( 25, 50, 75 ),
   variable(high) :: ascending_line( 50, 100 )
   ]).

fuzzy_rules(risk('serotonin syndrome'), [
   do(high) :: 'extra cellular serotonin' fzis high fzand
      ( 'serotonin removal' fzis low fzor 'serotonin response' fzis high),
   do(medium) :: 'extra cellular serotonin' fzis medium fzand
      ( 'serotonin removal' fzis medium fzor 'serotonin response' fzis medium),
   do(low) :: 'extra cellular serotonin' fzis low fzand
      ( 'serotonin removal' fzis high fzor 'serotonin response' fzis low)
   ]).

fuzzy_set('extra cellular serotonin', [
   variable(low) :: descending_line( 0, 50 ),
   variable(medium) :: triangle( 25, 50, 75 ),
   variable(high) :: ascending_line( 50, 100 )
   ]).

fuzzy_set('serotonin removal', [
   variable(low) :: descending_line( 0, 50 ),
   variable(medium) :: triangle( 25, 50, 75 ),
   variable(high) :: ascending_line( 50, 100 )
   ]).

fuzzy_set('serotonin response', [
   variable(low) :: descending_line( 0, 50 ),
   variable(medium) :: triangle( 25, 50, 75 ),
   variable(high) :: ascending_line( 50, 100 )
   ]).

fuzzy_rules('extra cellular serotonin', [
   do(high) :: alcohol fzis high,
   do(medium) :: alcohol fzis medium,
   do(low) :: alcohol fzis low
   ]).

fuzzy_rules('serotonin removal', [
   do(high) :: citalopram fzis low,
   do(medium) :: citalopram fzis medium,
   do(low) :: citalopram fzis high
   ]).

fuzzy_rules('serotonin response', [
   do(high) :: buspiron fzis high,
   do(medium) :: buspiron fzis medium,
   do(low) :: buspiron fzis low
   ]).

fuzzy_set('alcohol', [
   variable(low) :: descending_line( 0.0, 5.0 ),
   variable(medium) :: triangle( 2.5, 5.0, 7.5 ),
   variable(high) :: ascending_line( 5.0, 10.0 )
   ]).

fuzzy_set('citalopram', [
   variable(low) :: descending_line( 0.0, 5.0 ),
   variable(medium) :: triangle( 2.5, 5.0, 7.5 ),
   variable(high) :: ascending_line( 5.0, 10.0 )
   ]).

fuzzy_set('buspiron', [
   variable(low) :: descending_line( 0.0, 5.0 ),
   variable(medium) :: triangle( 2.5, 5.0, 7.5 ),
   variable(high) :: ascending_line( 5.0, 10.0 )
   ]).
There are some difficulties with the system. It handles cases where the values correspond to high and low risks, but if there are balancing high and low factors the system does not find an answer because no fuzzy rule applies. For example, if there is a lot of extra cellular serotonin, but the serotonin removal is also high, the result should be some balanced risk in the middle, but instead there isn't a rule that covers that case. Better choices of fuzzy sets or a more complex rule set should resolve that problem.
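The gap is easy to see by evaluating the three risk rules by hand. The Python sketch below does that under stated assumptions: fzand is read as the minimum and fzor as the maximum, the set shapes are my reading of descending_line, triangle and ascending_line, and the three intermediate variables are fed in directly on their 0 to 100 scale instead of being derived from drug amounts. The function name risk_rule_strengths is invented for this example, and the low serotonin response in the first case is my own choice to complete the scenario.

```python
def descending_line(a, b):
    """Membership 1 at or below a, falling linearly to 0 at b (assumed shape)."""
    return lambda x: max(0.0, min(1.0, (b - x) / (b - a)))


def ascending_line(a, b):
    """Membership 0 at or below a, rising linearly to 1 at b (assumed shape)."""
    return lambda x: max(0.0, min(1.0, (x - a) / (b - a)))


def triangle(a, b, c):
    """Membership 0 outside (a, c), peaking at 1 when x equals b."""
    return lambda x: max(0.0, min((x - a) / (b - a), (c - x) / (c - b)))


LEVELS = {   # shared by all three intermediate variables, 0 to 100
    "low": descending_line(0, 50),
    "medium": triangle(25, 50, 75),
    "high": ascending_line(50, 100),
}


def risk_rule_strengths(extra, removal, response):
    """Fire the three risk rules with fzand as min and fzor as max."""
    e = {name: f(extra) for name, f in LEVELS.items()}
    m = {name: f(removal) for name, f in LEVELS.items()}
    p = {name: f(response) for name, f in LEVELS.items()}
    return {
        "high": min(e["high"], max(m["low"], p["high"])),
        "medium": min(e["medium"], max(m["medium"], p["medium"])),
        "low": min(e["low"], max(m["high"], p["low"])),
    }


# Lots of extra cellular serotonin, but removal is high and response is low.
balanced = risk_rule_strengths(100, 100, 0)
print(balanced, "any rule fired:", any(v > 0 for v in balanced.values()))

# For comparison, a consistent high-risk case.
print(risk_rule_strengths(100, 0, 100))
```

In the balanced case every rule comes out at zero strength, so there is nothing to defuzzify. Rules that pair a high level of extra cellular serotonin with high removal, for example by mapping that combination to medium risk, would be one way to close the gap.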
Contact me for the current version of the full working system.
Announcements
Data Mining Guidance and Results for
the Data Rich - Yet Information Poor
DATA MINING: LEVELS I, II & III
[ Las Vegas, March | Wash-DC, June | San Diego, Sept ]
Learn how experts leverage AI technology to build and deploy predictive models by attending The Modeling Agency's vendor-neutral and application-oriented data mining courses. Participants will learn about data mining capabilities, limitations, methods, tools, techniques, applications, advantages and costly pitfalls.
Since The Modeling Agency is not a tools vendor, participants enjoy a balanced, broad, and non-promotional perspective of predictive analytics. Don't leave a powerful competitive advantage untapped: harness the patterns, drivers, interrelationships and profits hidden within your data.
FULL COURSE DETAILS AND REGISTRATION
Don't miss your chance to attend; reserve your space today:
http://www.the-modeling-agency.com/training
888-742-2454 (toll free)
936-321-2177 (direct)
Links
http://www.genetic-programming.org/ - A resource page for information about genetic programming.
http://www.genetic-programming.com/published/Salon081099.html - An article that mentions the use of genetic programming in the design of RoboCup competitors.
http://demo.cs.brandeis.edu/ - Brandeis home page for the DEMO, Dynamical & Evolutionary Machine Organization project. Follow the fun and games links to EvoCAD to see a Java applet demonstration of the evolutionary Lego(tm) designer. It helps to read the paper (next link) first to understand the nature of the simulation.
http://www.cs.brandeis.edu/~pablo/thesis/ - Pablo Funes's graduate work is all about complexity and evolutionary design. His excellent Web site contains the text of his thesis, including details on the two-dimensional Lego(tm) block simulator.
http://iridia.ulb.ac.be/~mdorigo/ACO/ACO.html - Marco Dorigo has done much of the significant research on ant colony optimization (ACO). This is his Web site that contains numerous links to articles and other related information on ACO.
http://www.louderthanabomb.com/ - Louder! is a site dedicated to AI for games, but the products are also useful for education and other applications. They have an excellent fuzzy logic system that can be integrated with C++ to provide fuzzy services to the C++ application.
http://www.fuzzytech.com/binaries/ieccd1.pdf - The International Electrotechnical Commission's draft standard for fuzzy logic for use with programmable controllers. The final version can be purchased from the IEC.
http://ffll.sourceforge.net/fcl.htm - Louder!'s description of the IEC fuzzy logic standard, plus links to sources for the final versions.
Until next month.