Automation Control Systems
Introduction to Simultaneous Localization And Mapping (SLAM) for mobile robot. Navigational sensors used in SLAM: Internal, External, Range sensors, Odometry, Inertial Navigation Systems, Global Positioning System. Map processing and updating principle.
1. Analysis of problem and problem statement
1.1 Introduction to Simultaneous Localization And Mapping (SLAM) for mobile robot
The simultaneous localization and mapping (SLAM) problem asks whether a mobile robot placed at an unknown location in an unknown environment can incrementally build a consistent map of this environment while simultaneously determining its location within this map. A solution of the SLAM problem for mobile robots would provide the means to make a robot truly autonomous.
Over the past decade, work on the SLAM problem has been rather successful in the robotics community. SLAM has been formulated and solved as a theoretical problem in a number of different forms. It has also been implemented in a number of different domains, from indoor robots to outdoor, underwater, and airborne systems.
1.1.1 History of the SLAM Problem
The genesis of the probabilistic SLAM problem occurred at the 1986 IEEE Robotics and Automation Conference held in San Francisco, California. It was a time when probabilistic methods were only beginning to be introduced into both robotics and artificial intelligence (AI). A number of researchers had been looking at the application of estimation-theoretic methods to mapping and localization problems; these included Peter Cheeseman, Jim Crowley, Randall Smith, Raja Chatila, Olivier Faugeras and Hugh Durrant-Whyte. Over the course of the conference, long discussions about consistent mapping took place. The result of these conversations was the recognition of consistent probabilistic mapping as a fundamental problem in robotics, with major conceptual and computational issues that needed to be addressed. Over the next few years a number of key papers were produced. The work of Smith and Cheeseman [1] and Durrant-Whyte [2] established a statistical basis for describing relationships between landmarks and manipulating geometric uncertainty. A key element of this work was to show that there must be a high degree of correlation between estimates of the location of different landmarks in a map and that, indeed, these correlations would grow with successive observations.
At the same time, Ayache and Faugeras [3] were undertaking early work in visual navigation, while Crowley [9] and Chatila and Laumond [6] were working on sonar-based navigation of mobile robots using Kalman filter-type algorithms.
These two strands of research had much in common and resulted soon after in the landmark paper by Smith. This paper showed that as a mobile robot moves through an unknown environment taking relative observations of landmarks, the estimates of these landmarks are all necessarily correlated with each other because of the common error in estimated vehicle location. The implication of this was profound: A consistent full solution to the combined localization and mapping problem would require a joint state composed of the vehicle pose and every landmark position, to be updated following each landmark observation. In turn, this would require the estimator to employ a huge state vector (on the order of the number of landmarks maintained in the map) with computation scaling as the square of the number of landmarks.
Crucially, this work did not look at the convergence properties of the map or its steady-state behavior. Indeed, it was widely assumed at the time that the estimated map errors would not converge and would instead exhibit a random-walk behavior with unbounded error growth. Thus, given the computational complexity of the mapping problem and without knowledge of the convergence behavior of the map, researchers instead focused on a series of approximations to the consistent mapping problem, which assumed or even forced the correlations between landmarks to be minimized or eliminated, so reducing the full filter to a series of decoupled landmark to vehicle filters. Also for these reasons theoretical work on the combined localization and mapping problem came to a temporary halt, with work often focused on either mapping or localization as separate problems.
The conceptual breakthrough came with the realization that the combined mapping and localization problem, once formulated as a single estimation problem, was actually convergent. Most importantly, it was recognized that the correlations between landmarks, which most researchers had tried to minimize, were actually the critical part of the problem, and that, on the contrary, the more these correlations grew, the better the solution. The structure of the SLAM problem, the convergence result and the coining of the acronym SLAM were first presented in a mobile robotics survey paper presented at the 1995 International Symposium on Robotics Research [4]. Csorba [10], [11] developed the essential theory on convergence and many of the initial results. Several groups already working on mapping and localization, notably at the Massachusetts Institute of Technology [7], Zaragoza [8], [5], the ACFR at the University of Sydney [20], and others began working on SLAM - also called concurrent mapping and localization (CML) at this time - in indoor, outdoor, and subsea environments.
1.1.2 Spheres of application
SLAM applications now exist in a wide variety of domains, including indoor, outdoor, aerial, and subsea settings, and with different sensing modalities such as bearing-only and range-only sensors.
With SLAM techniques there are more than a few possible application areas, e.g., rescue robots and service robots. One can imagine intelligent autonomous robots which can map a destroyed environment that is too dangerous for humans to explore, e.g., buildings after an earthquake. With the provided map, a human rescuer could locate possible victims and paths to them. On the other hand, there are more commercial applications, such as cleaning or security robots that need a map of an office environment. The work on SLAM is interdisciplinary and integrates the best techniques from computer vision, statistics and informatics. To achieve the goal of localization and mapping, the common approach is to equip a mobile robot system with a wide range of sensors, e.g., laser scanners, odometry, vision, sonar or RFID receivers. With a large array of sensors, a mobile robot can map the environment during indoor or outdoor exploration. If, however, the robot only has access to a vision unit and odometry, the SLAM problem becomes more challenging. In the literature this class of problem is called vision-only SLAM (VSLAM).
SLAM is an essential capability for mobile robots traveling in unknown environments where globally accurate position data (e.g. GPS) is not available. In particular, mobile robots have shown significant promise for remote exploration, going places that are too distant [25], too dangerous [26], or simply too costly to allow human access. If robots are to operate autonomously in extreme environments undersea, underground, and on the surfaces of other planets, they must be capable of building maps and navigating reliably according to these maps.
Even in benign environments, such as the interiors of buildings, accurate, prior maps are often difficult to acquire. The capability to map an unknown environment allows a robot to be deployed with minimal infrastructure. This is especially important if the environment changes over time.
The maps produced by SLAM algorithms typically serve as the basis for motion planning and exploration. However, the maps often have value in their own right. In July of 2002, nine miners in the Quecreek Mine in Somerset, Pennsylvania were trapped underground for three and a half days after accidentally drilling into a nearby abandoned mine. A subsequent investigation attributed the cause of the accident to inaccurate maps [24]. Since the accident, mobile robots and SLAM have been investigated as possible technologies for acquiring accurate maps of abandoned mines. One such robot, shown in Figure 1.1(b), is capable of building 3D reconstructions of the interior of abandoned mines using SLAM technology [26].
1.1.3 Navigational sensors used in SLAM
Visual sensing is the most information-rich modality for navigation in everyday environments. Recent advances in simultaneous localization and map building for mobile robots have been made using sonar and laser range sensing to build maps in 2D, but these advances have been largely overlooked in the vision literature.
Given that mobile robots are required to operate in largely unstructured environments and are subject to uncertainty in their motion, they must have means of sensing the environment. As the vehicle moves, it must be able to anticipate and respond to dynamic changes in its environment. Sensing can also aid in navigation as it allows the vehicle to estimate its motion and to identify the state of the environment. Sensors play an important role in navigation and planning as they allow the vehicle to track recognizable features of the environment and to detect and respond to obstacles in its path.
Internal sensors
Internal sensors are those that provide information about the robot by monitoring its motion. Robot kinematic and dynamic equations can then be used in conjunction with this information to obtain an estimate of the position of the robot.
Odometry [12]
Wheel encoders can be mounted to track the rotation and velocity of the wheels of the vehicle. Shaft encoders can also be fixed to steering mechanisms to measure the steer angle of the wheels. These measured values can be integrated using an appropriate model to estimate the motion of the vehicle. A vehicle model attempts to capture the fundamental relationship between the vehicle's motion, its current state x(t), and a given control input u(t):
ẋ(t) = f(x(t), u(t))    (1.1)
An accurate vehicle model is an essential component of most navigation schemes. Vehicle models can be of varying degrees of complexity. Depending on the reliability with which the motion of the vehicle can be modeled and on the sensors available for predicting the change of state of the vehicle, highly accurate estimates of vehicle motion can be generated. Ideally, of course, a vehicle model would capture the motion of the vehicle without any uncertainty and the vehicle state would be precisely predicted at all times. This is a practical impossibility, however, as the models used do not, in general, perfectly capture the vehicle motion, and sensors and actuators are subject to noise that will slowly corrupt the accuracy of the state estimate. This form of position estimation is therefore reliable for short motions, but accumulated errors introduced by the integration cause the estimates to drift from the true position of the vehicle unless they are periodically reset by external observations.
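To make the dead-reckoning idea concrete, the following sketch in Python (the wheel base, step length and noise levels are illustrative assumptions, not values from this work) integrates wheel-encoder increments of a differential-drive vehicle into a pose estimate and shows how small encoder noise makes the estimate drift from the true path.

```python
import numpy as np

def integrate_odometry(pose, d_left, d_right, wheel_base):
    """Propagate a differential-drive pose (x, y, theta) by one pair of wheel increments.

    d_left, d_right -- distances travelled by the left/right wheels [m]
    wheel_base      -- distance between the two wheels [m] (assumed parameter)
    """
    x, y, theta = pose
    d_center = 0.5 * (d_left + d_right)           # forward displacement of the vehicle centre
    d_theta = (d_right - d_left) / wheel_base     # change in heading
    x += d_center * np.cos(theta + 0.5 * d_theta)
    y += d_center * np.sin(theta + 0.5 * d_theta)
    return np.array([x, y, theta + d_theta])

# Drive straight for 100 steps; small encoder noise is enough to make the estimate drift.
rng = np.random.default_rng(0)
true_pose = np.array([0.0, 0.0, 0.0])
est_pose = true_pose.copy()
for _ in range(100):
    true_pose = integrate_odometry(true_pose, 0.05, 0.05, wheel_base=0.3)
    noisy_left = 0.05 + rng.normal(0.0, 0.001)
    noisy_right = 0.05 + rng.normal(0.0, 0.001)
    est_pose = integrate_odometry(est_pose, noisy_left, noisy_right, wheel_base=0.3)
print("position error after 100 steps [m]:", np.linalg.norm(est_pose[:2] - true_pose[:2]))
```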
Inertial Navigation Systems
Inertial Navigation Systems (INS) [13] measure the accelerations and rotational rates experienced by a body. At the heart of these systems is an Inertial Measurement Unit (IMU) that normally consists of at least three accelerometers and three gyroscopes. By integrating the signals generated by these sensors, it is possible to estimate the motion of a vehicle in space. Once again, however, the integration of the signal causes the error in the estimate to grow without bound unless the estimate is periodically reset by external observations.
The rate at which the error in the estimated vehicle position grows is known as the drift. There is a great deal of variability in the quality and performance of Inertial Navigation Systems. The quality of the IMU is largely a function of the quality of the gyroscopes and, to a lesser degree, of the accelerometers. High-quality ring laser gyroscopes typically have drift rates on the order of fractions of a degree per hour, while cheaper fiber optic gyroscopes might drift up to tens of degrees per hour. This drift in the angular estimate of the unit also contributes to a rapid growth in the error of the estimated position, as the gravity vector will not be properly aligned.
External sensors
As mentioned before, the accuracy of vehicle location estimates derived from the internal sensors gradually degrades due to the inherent noise in the sensors and the inevitable inaccuracies present in the vehicle models used. External sensors that monitor the robot's environment are therefore essential, not only to make the robot control system aware of the state of its surroundings but also to correct the errors in the robot location estimates.
Global Positioning System [14]
The Global Positioning System (GPS) provides an important source of external information about the position of a receiver for outdoor navigation. A series of satellites orbiting the Earth provide range measurements between the satellite and the receiver. By knowing the locations of the satellites, it is possible to triangulate the position of a receiver by tracking signals received from the satellites that are within range. This information provides an external source of positioning that can be used to correct the information from inertial sensors.
As with most other sensors, there is a wide variety of GPS receivers currently available on the market, and these vary significantly both in accuracy and in price. Lower-end GPS receivers can achieve accuracies on the order of tens of meters using a relatively small number of satellites. Differential GPS (DGPS) and Real Time Kinematic (RTK) units use a fixed base station to reduce noise in the signal, allowing the units to provide accuracies on the order of centimeters. The vehicle is equipped with an RTK receiver to provide accurate position fixes in open areas. It is important to note that GPS produces erroneous location estimates in the presence of structures that reflect radio waves, and when the view of the sky is obscured the achievable accuracy is reduced.
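As a rough illustration of position fixing from range measurements, the sketch below (a simplified 2D geometry with assumed anchor positions and noise levels, not an actual GPS solution with clock-bias estimation) recovers a receiver position from ranges to known reference points using a Gauss-Newton iteration.

```python
import numpy as np

def trilaterate(anchors, ranges, x0, iters=10):
    """Estimate a 2D position from ranges to known anchor points (Gauss-Newton iteration)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        diffs = x - anchors                       # (N, 2) vectors from anchors to the estimate
        dists = np.linalg.norm(diffs, axis=1)     # predicted ranges
        residual = ranges - dists
        J = diffs / dists[:, None]                # Jacobian of the predicted ranges
        dx, *_ = np.linalg.lstsq(J, residual, rcond=None)
        x = x + dx
    return x

# Assumed reference positions and a true receiver position; the ranges carry a little noise.
anchors = np.array([[0.0, 100.0], [80.0, 90.0], [-70.0, 85.0], [10.0, 120.0]])
true_pos = np.array([5.0, 2.0])
ranges = np.linalg.norm(anchors - true_pos, axis=1) + np.random.normal(0.0, 0.5, len(anchors))
print("estimated position:", trilaterate(anchors, ranges, x0=[0.0, 0.0]))
```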
Range sensors
An alternative method for obtaining an estimate of the vehicle location is to measure the relative position of a known object in the environment using a sensor mounted on the vehicle. There are a number of sensors available for sensing the range and bearing to reflectors in the environment [13]. Sonar, laser and radar have all been used in the deployment of mobile robotic vehicles. These sensors typically rely on emitting a pulse of energy and waiting for reflected energy from objects in the environment. It is possible to determine the distance to the object that returned the signal by measuring the time of flight between the sent and received signals. By rotating the transceiver head or providing some other scanning mechanism on these systems it is possible to scan the sensor through a range of bearings making it possible to build up a more detailed picture of the environment.
Sonar, laser and radar all use different wavelengths for the transmitted energy.
Each sensor has its inherent strengths and weaknesses and is therefore suitable for deployment in particular application domains. In air, sonar sensors have been used extensively in indoor mobile robotic systems due to their low cost and ease of interfacing. Many units, however, suffer from low accuracy in angular resolution and a high incidence of specular reflections.
Sonars are also used extensively in underwater applications, where the wavelengths used propagate over much longer ranges than energy at other frequencies. Scanning lasers have also proven to be popular sensors. They have gained popularity due to their ease of use, high range accuracy and good angular resolution, and they have also come down significantly in price in recent years.
The performance of lasers can, however, suffer outdoors due to ambient light, dust, rain and other suspended particles in the air. Millimeter wave radar also has excellent range accuracy and has good angular resolution depending on the antenna selected. Radar is able to penetrate dust and rain, providing improved performance in these situations when compared with lasers. Radars are unfortunately a somewhat more expensive option than other range/bearing sensors.
Vision
Vision has received a considerable amount of attention over the years as a sensor for mobile robot navigation. Stereo pairs of cameras [15] can be used to generate depth information by triangulating the position of features relative to the image plane. Other work has concentrated on generating detailed three-dimensional models of the environment by recreating the structure of a scene using the estimated motion of the camera.
Table 1.1 - Comparison of different types of sensors

Sensor | Accuracy | Autonomous | Price | Operating conditions | Disadvantages
Odometry | 4×4 m square path: smooth floor 30 mm; 10 bumps 500 mm | Yes | $30 and up | Smooth floor | Requires correction by other sensors
Vision | Depends on camera properties and lighting | No, requires the presence of landmarks | $20 and up | Daytime, otherwise requires an additional light source | Requires the presence of markers; detection can fail
Inertial Navigation System (INS) | Position drift rate: 1-8 cm/s depending on frequency of acceleration change | Yes | $25,000-$1,000,000 | Less effective with frequent changes in system velocity | Accuracy decreases with time
Global Positioning System (GPS) | On the order of 20 m during motion, a few meters when standing for minutes | Yes | $150-$5,000 | Anywhere on the Earth's surface | Signal may black out
Range sensors | Depends on the environment where the sensor is used | No, requires the presence of landmarks | $30 and up | Wide variety of domains | Applicable only to certain types of environment
1.1.4 Coordinate systems used in SLAM
An important competence for a mobile system is the ability to build and maintain maps of initially unknown environments. Maps allow the vehicle to plan its movement within the environment in order to achieve its goals. The automatic creation of maps allows the vehicle to model the world using its sensory resources. In some instances, the creation of an accurate map of the environment is a goal in itself, while in many other circumstances the maps are used as a tool for performing other, higher-level tasks such as planning to achieve a goal or coordinating the motion of multiple robots. Given accurate positioning information, the creation of a map of the environment is a straightforward undertaking: given a current position estimate and an observation of the environment, the observation can be fused into the existing map to create an updated map estimate. One of the simplest methods for generating a map is to divide the environment into a grid consisting of cells whose properties are recorded. These grid maps can be used to indicate whether areas of the environment are occupied by obstacles [18]. In outdoor environments, grid cells can also encode other properties of the terrain such as traversability. Grid maps become computationally expensive for large environments. Feature maps, on the other hand, rely on the extraction of relevant environmental features with which to build maps [16, 17]. Most feature maps consist of a set of landmarks and encode some relationship between the features contained in the maps.
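A minimal sketch of the occupancy-grid idea mentioned above is given below (the grid size, resolution and log-odds increments are illustrative assumptions): the cell hit by a range measurement is made more likely to be occupied, while cells along the ray are made more likely to be free.

```python
import numpy as np

class OccupancyGrid:
    """Toy occupancy grid using log-odds updates (size, resolution and increments are assumed)."""
    def __init__(self, size=100, resolution=0.1):
        self.res = resolution                      # metres per cell
        self.log_odds = np.zeros((size, size))     # 0 = unknown

    def to_cell(self, point):
        return int(round(point[0] / self.res)), int(round(point[1] / self.res))

    def update_ray(self, robot_xy, hit_xy, l_occ=0.9, l_free=-0.4):
        """Mark the endpoint as more likely occupied and cells along the ray as more likely free."""
        robot = np.asarray(robot_xy, dtype=float)
        hit = np.asarray(hit_xy, dtype=float)
        n_steps = int(np.linalg.norm(hit - robot) / self.res) + 1
        for t in np.linspace(0.0, 1.0, n_steps, endpoint=False):
            cx, cy = self.to_cell(robot + t * (hit - robot))
            self.log_odds[cy, cx] += l_free        # cell traversed by the beam
        hx, hy = self.to_cell(hit)
        self.log_odds[hy, hx] += l_occ             # cell that returned the echo

grid = OccupancyGrid()
grid.update_ray(robot_xy=(1.0, 1.0), hit_xy=(3.0, 2.0))   # one simulated range return
print("cells marked occupied:", int(np.count_nonzero(grid.log_odds > 0)))
```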
Much work has been done to improve the computational complexity of the classic SLAM problem. Updating a single global map is inconvenient because it requires complex processing, which is why the submapping principle was introduced. Submapping has become highly popular as a way of tackling large-environment SLAM. The aim is to reduce the complexity sufficiently by operating within a smaller submap. The submaps can then be merged into the global map after a period of time, or alternatively the submaps can remain separate.
Figure 1.2 - Map processing and updating principle, used in SLAM
1.2 Design of the visual SLAM system
The VSLAM system consists of a vehicle, a turn device, and a personal computer (processing unit and memory), and is equipped with a single camera used as a vision sensor and with odometry sensors providing dead-reckoning data. As odometry sensors, standard wheel encoders attached to the driving wheels of a differential drive system may be used.
Figure 1.3 - Operational principle of VSLAM
VSLAM enables localization and mapping using a single low-cost camera and dead reckoning to make visual measurements of the environment. This technology is a major achievement in autonomous navigation, enabling manufacturers to develop a new generation of useful robotic products and devices with more intelligence than previously possible. With VSLAM, a robot or device captures visual landmarks and uses them to build a map of its environment. As the product moves through the area it has mapped, it navigates based on recognized visual landmarks on its map.
When entering a new environment, the VSLAM-enabled product can either load an existing map of that area or start creating a new map. The product then continues to move, detect old landmarks, create new landmarks, and correct its position information as necessary using its map.
The navigation object is the robot, which has four wheels and differential control. Its speed is 5 cm/s.
The visual sensor is a web camera situated in the center of the robot (figure 1.4).
Figure 1.4 - Position of camera on the robot body
The camera works all the time; the robot moves and investigates the environment continuously, processing the obtained pictures of the environment every 5 seconds. If the robot finds an object (or image) that is similar (by some characteristics) to a landmark stored in the robot's memory, it must check this object. The image is then scanned and compared with the true landmark using image matching methods.
The requirements for the developed VSLAM system are:
- to distinguish color marker (or landmark) among other visual objects;
- to determine the distance to the color marker (or landmark);
- having the information on vehicle location to determine the coordinates of the color marker (or landmark);
- update a stored map with the new location information and landmarks;
- organize submaps into one global map.
1.3 Problem statement for the development of the localization algorithm
The goal of this research project is to prepare and design an outdoor mobile robot that uses the principles of VSLAM, so that it can be placed in an unknown environment, build a map of this environment and navigate according to this map. In SLAM, both the trajectory of the platform and the locations of all landmarks are estimated online without the need for any a priori knowledge of location.
For this purpose we use two types of sensors: odometry and visual.
Odometry is the most widely used navigation method for mobile robot positioning; it provides good short-term accuracy, is inexpensive, and allows very high sampling rates. However, the fundamental idea of odometry is the integration of incremental motion information over time, which leads inevitably to the unbounded accumulation of errors. Odometry is based on simple equations, which hold true when wheel revolutions can be translated accurately into linear displacement relative to the ground.
Visual sensors are applied for segmentation of the picture and its further morphological analysis. The result of this analysis is the separation of closed color objects and the marking of some of them as color landmarks. Color landmarks are considered to be round, closed spots of a certain size and color. Then, with the help of the visual sensors and some calculations, the distance to the object is determined, knowing the parameters and characteristics of the camera.
By processing the information obtained from the sensors, it is possible to implement VSLAM and obtain a fully autonomous robot.
We also have to take into account that noisy sensors make VSLAM a difficult problem. Every time the robot moves, it relies on its sensors to compute the distance traveled. Since there is no such thing as a noise-free sensor, there is always a small error associated with every move. These errors add up very quickly and the robot becomes lost. Correcting these errors is easy if a good map of the robot's operating environment is given; the robot can correct its estimated position by computing its relative position to landmarks that are present in the map and currently visible.
The error that arises from the odometry sensors will be compensated by means of the visual sensors, so the two kinds of sensors verify each other in order to reduce the error.
2. Review of possible solutions
2.1 Analysis of sensors used in SLAM
2.1.1 Vision sensors
Vision sensors produce very rich data, and many algorithms from the computer vision community are ready to use and can be directly applied in robotic applications. Moreover, vision sensors combine properties which are of major importance for SLAM: they are low-cost, small and power-saving, and are therefore easy to embed on a robotic platform. They also perceive data with a large field of view, at high frame rates and high resolutions.
One must differentiate monocular vision and stereo vision [15, 17]. A stereo head is composed of two cameras; with these two images 3D data of the surrounding environment can be recovered. But stereo-vision suffers from two main problems:
· Depending on the baseline of the stereo bench, 3D information is obtained only up to a given depth; this is an intrinsic limit of this sensor. For instance, it is very difficult to endow an aerial robot with a stereo bench whose baseline is large enough to get 3D information on the ground while flying at a reasonable altitude.
· A stereo-bench must be calibrated so as to produce correct 3D information, and when it is used on a vehicle it is prone to lose its calibration.
For these reasons, monocular vision was chosen to develop my VSLAM system.
2.1.2 Infrared (IR) range sensors
In general IR sensors are small devices that can emit an infrared light and sense the presence of an object using the IR-beam reflected from the object itself [22]. Considering the robotics field and especially mobile robotics, IR sensors have often been used as proximity sensors, with the task of detecting obstacles.
Apparent motion is detected when an infrared source with one temperature, such as a human, passes in front of an infrared source with another temperature, such as a wall.
Typically a sensor is composed of a sender and a receiver. Infra-red sensors use triangulation and a small linear CCD array to compute the distance and/or presence of objects in the field of view.
The basic behaviour of the Sharp IR sensor is as follows:
– A pulse of IR light is emitted by the emitter
– This light travels out in the field of view and either hits an object or just keeps on going
– In the case of no object, the light is never reflected and the reading shows no object
– If the light reflects off an object, it returns to the detector and creates a triangle between the point of reflection, the emitter, and the detector
– The angles in this triangle vary based on the distance to the object
– The receiver portion of the detector is a precision lens that transmits the reflected light onto various portions of an enclosed linear CCD array based on the angle of the triangle described above
– The CCD array can then determine what angle the reflected light came back at and therefore, it can calculate the distance to the object.
2.1.3 Ultra-sound range sensors
Ultrasound sensors [9] work on a principle similar to radar or sonar which evaluates attributes of a target by interpreting the echoes from radio or sound waves respectively. Ultrasound sensors generate high frequency sound waves and evaluate the echo which is received back by the sensor. Sensors calculate the time interval between sending the signal and receiving the echo to determine the distance to an object.
The sensor transmits a packet of (ultrasonic) pressure waves. The distance d to the echoing object can be calculated from the propagation speed of sound c and the time of flight t as follows:
d = c · t / 2    (2.1)
The speed of sound c (about 340 m/s in air) is given by
c = √(γ · R · T)    (2.2)
where:
γ - ratio of specific heats
R - gas constant
T - temperature in kelvin
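To make equations (2.1) and (2.2) concrete, here is a small numerical sketch (the constants for air and the example time of flight are assumed values):

```python
import math

GAMMA = 1.4        # ratio of specific heats for air (assumed)
R_AIR = 287.0      # specific gas constant for air [J/(kg*K)] (assumed)

def speed_of_sound(temp_kelvin):
    """Equation (2.2): c = sqrt(gamma * R * T)."""
    return math.sqrt(GAMMA * R_AIR * temp_kelvin)

def echo_distance(time_of_flight, temp_kelvin=293.15):
    """Equation (2.1): d = c * t / 2, since the pulse travels out and back."""
    return speed_of_sound(temp_kelvin) * time_of_flight / 2.0

# A 6 ms echo at 20 degrees Celsius corresponds to roughly one metre.
print(f"c = {speed_of_sound(293.15):.1f} m/s, d = {echo_distance(0.006):.2f} m")
```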
Figure 2.1 - Wave processing by ultra-sound range sensor
Typical frequency: 40-180 kHz. The sound beam propagates in a cone-like manner:
- Opening angles around 20 to 40 degrees
- Regions of constant depth
- Segments of an arc (sphere for 3D)
Figure 2.2 - Distribution of wave by ultra-sound range sensor
Other problems for ultrasonic sensors:
- Soft surfaces that absorb most of the sound energy
- Surfaces that are far from being perpendicular to the direction of the sound - specular reflection.
Figure 2.3 - The SRF04 ultrasound sensor
2.1.4 Other sensors
Odometry sensors
Odometry is the most widely used navigation method for mobile robot positioning; it provides good short-term accuracy, is inexpensive, and allows very high sampling rates [12]. However, the fundamental idea of odometry is the integration of incremental motion information over time, which leads inevitably to the unbounded accumulation of errors. Specifically, orientation errors will cause large lateral position errors, which increase proportionally with the distance travelled by the robot. Despite these limitations, most researchers agree that odometry is an important part of a robot navigation system and that navigation tasks will be simplified if odometric accuracy can be improved.
Magnetic compasses
Vehicle heading is the most significant of the navigation parameters (x, y, and θ) in terms of its influence on accumulated dead-reckoning errors. For this reason, sensors which provide a measure of absolute heading are extremely important in solving the navigation needs of autonomous platforms. The magnetic compass is such a sensor. One disadvantage of any magnetic compass, however, is that the earth's magnetic field is often distorted near power lines or steel structures [Byrne et al., 1992]. This makes the straightforward use of geomagnetic sensors difficult for indoor applications. Based on a variety of physical effects related to the earth's magnetic field, different sensor systems are available [19]:
· Mechanical magnetic compasses.
· Fluxgate compasses.
· Hall-effect compasses.
· Magnetoresistive compasses.
· Magnetoelastic compasses.
The compass best suited for use with mobile robot applications is the fluxgate compass. When maintained in a level attitude, the fluxgate compass will measure the horizontal component of the earth's magnetic field, with the decided advantages of low power consumption, no moving parts, tolerance to shock and vibration, rapid start-up, and relatively low cost. If the vehicle is expected to operate over uneven terrain, the sensor coil should be gimbal-mounted and mechanically dampened to prevent serious errors introduced by the vertical component of the geomagnetic field.
Figure 2.4 - The C-100 fluxgate compass engine
2.2 Analysis of feature extraction methods
2.2.1 Color spaces
RGB (red, green, and blue) refers to a system for representing the colors to be used on a computer display [19]. Red, green, and blue can be combined in various proportions to obtain any color in the visible spectrum. Levels of R, G, and B can each range from 0 to 100 percent of full intensity. Each level is represented by the range of decimal numbers from 0 to 255 (256 levels for each color), equivalent to the range of binary numbers from 00000000 to 11111111, or hexadecimal 00 to FF. The total number of available colors is 256 x 256 x 256, or 16,777,216 possible colors.
Figure 2.5 - RGB Color space
CMYK (short for cyan, magenta, yellow, and key (black), and often referred to as process color or four color) is a subtractive color model, used in color printing, also used to describe the printing process itself. Though it varies by print house, press operator, press manufacturer and press run, ink is typically applied in the order of the acronym. The CMYK model works by partially or entirely masking certain colors on the typically white background (that is, absorbing particular wavelengths of light). Such a model is called subtractive because inks "subtract" brightness from white.
In additive color models such as RGB, white is the "additive" combination of all primary colored lights, while black is the absence of light. In the CMYK model, it is just the opposite: white is the natural color of the paper or other background, while black results from a full combination of colored inks. To save money on ink, and to produce deeper black tones, unsaturated and dark colors are produced by substituting black ink for the combination of cyan, magenta and yellow.
Figure 2.6 - CMYK Color model
The Lab color space consists of Luminance (or lightness) and A and B (the two chromatic components). According to this model, A ranges from green to red, and B ranges from blue to yellow. This model was designed to be device independent; in other words, by means of this model you can handle colors regardless of specific devices (such as monitors, printers, or computers). The Luminance ranges from 0 to 100, the A component ranges from -120 to +120 (from green to red) and the B component ranges from -120 to +120 (from blue to yellow).
Unlike the RGB and CMYK color models, Lab color is designed to approximate human vision. It aspires to perceptual uniformity, and its L component closely matches human perception of lightness. It can thus be used to make accurate color balance corrections by modifying output curves in a and b components, or to adjust the lightness contrast using the L component. These transformations are difficult or impossible in the RGB or CMYK spaces, which model the output of physical devices, rather than human visual perception.
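A small sketch of moving into the Lab space (assuming the scikit-image library is available; the sample pixel values are arbitrary) shows how the L component separates lightness from the chromatic a and b components:

```python
import numpy as np
from skimage.color import rgb2lab

# Three sample pixels (pure red, green and yellow) as a 1x3 RGB image with values in [0, 1].
rgb = np.array([[[1.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0],
                 [1.0, 1.0, 0.0]]])
lab = rgb2lab(rgb)

for name, (L, a, b) in zip(["red", "green", "yellow"], lab[0]):
    print(f"{name:6s}  L = {L:6.1f}  a = {a:7.1f}  b = {b:7.1f}")
```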
Model | Components | Description
RGB | Red, Green and Blue | The RGB model creates colors by adding light; it is called an additive color model. The art of reproducing color by adding the three primary RGB colors in varying proportions is called ADDITIVE mixing.
CMYK | Cyan, Magenta, Yellow and Black | The CMYK color model creates colors by absorbing light; it is called a subtractive color model. The CMYK gamut is narrower than that of RGB. CMY are the three primary colors used in SUBTRACTIVE mixing; when the CMY colors are mixed, they produce BLACK or COMPOSITE BLACK (CMYK).
LAB | Luminance and two chromatic components: A - green to magenta, B - blue to yellow | LAB or Lab (CIE L*a*b) is a color model created by the Commission Internationale de l'Eclairage (CIE). It contains a luminance (or lightness) component (L) and two chromatic components: "a" (green to red or magenta) and "b" (blue to yellow). The Lab color space is based on the Lab color model.
2.2.2 Color segmentation methods
A central problem, called segmentation, is to distinguish objects from the background. Image segmentation refers to the partitioning of image pixels into meaningful regions, and it is often the first step in image analysis. For intensity images (i.e., those represented by point-wise intensity levels) four popular approaches are: threshold techniques, edge-based methods (boundary-based segmentation relying on region edges), region-based techniques (region-based segmentation relying on interior pixels), and connectivity-preserving relaxation methods.
Threshold techniques, which make decisions based on local pixel information, are effective when the intensity levels of the objects fall squarely outside the range of levels in the background. Because spatial information is ignored, however, blurred region boundaries can create havoc.
This technique is based upon a simple concept [21]. A parameter θ, called the brightness threshold, is chosen and applied to the image a[m, n] as follows:
If a[m, n] ≥ θ, then a[m, n] = object = 1; otherwise a[m, n] = background = 0.
This version of the algorithm assumes that there is an interest in light objects on a dark background. For dark objects on a light background the following algorithm is used:
If a[m, n] < θ, then a[m, n] = object = 1; otherwise a[m, n] = background = 0.
The output is the label "object" or "background" which, due to its dichotomous nature, can be represented as a Boolean variable "1" or "0".
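A minimal sketch of the thresholding rule above (the threshold value and the synthetic image are illustrative assumptions):

```python
import numpy as np

def threshold_segment(image, theta, light_objects=True):
    """Label each pixel as object (1) or background (0) using a brightness threshold."""
    if light_objects:
        return (image >= theta).astype(np.uint8)   # light objects on a dark background
    return (image < theta).astype(np.uint8)        # dark objects on a light background

# Synthetic 5x5 grayscale image with a bright 2x2 "object".
img = np.zeros((5, 5))
img[1:3, 1:3] = 200
print(threshold_segment(img, theta=128))
```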
Figure 2.8 - Color segmentation method, using threshold techniques
Edge-based methods center around contour detection: their weakness in connecting together broken contour lines makes them, too, prone to failure in the presence of blurring.
A region-based method usually proceeds as follows: the image is partitioned into connected regions by grouping neighboring pixels of similar intensity levels. Adjacent regions are then merged under some criterion involving perhaps homogeneity or sharpness of region boundaries. Overstringent criteria create fragmentation; lenient ones overlook blurred boundaries and overmerge. Hybrid techniques using a mix of the methods above are also popular.
Steps of the region-based method procedure:
· Regions have already been defined before.
· A complete segmentation of an image R is a finite set of regions R1, …, RS such that
R = R1 ∪ R2 ∪ … ∪ RS and Ri ∩ Rj = ∅ for i ≠ j,
where S is the total number of regions in the image.
· Further assumptions needed in this section are that regions must satisfy the following conditions:
H(Ri) = TRUE for i = 1, …, S, and
H(Ri ∪ Rj) = FALSE for i ≠ j when Rj is adjacent to Ri,
where H(Ri) is a binary homogeneity evaluation of the region Ri.
· The resulting regions of the segmented image must be both homogeneous and maximal, where by 'maximal' we mean that the homogeneity criterion would not hold after merging a region with any adjacent region.
· While the region growing methods discussed below deal with two-dimensional images, three-dimensional implementations are often possible. Considering three-dimensional connectivity constraints, homogeneous regions (volumes) of a three-dimensional image can be determined using three-dimensional region growing. A minimal two-dimensional region-growing sketch is given after this list.
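The following sketch grows a 2D region under a simple homogeneity criterion (a new pixel is accepted while it differs from the seed value by less than a tolerance; the tolerance and the test image are assumptions):

```python
import numpy as np
from collections import deque

def region_grow(image, seed, tol=10):
    """Grow a 4-connected region from `seed` while |pixel - image[seed]| <= tol."""
    h, w = image.shape
    region = np.zeros((h, w), dtype=bool)
    seed_value = float(image[seed])
    queue = deque([seed])
    region[seed] = True
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < h and 0 <= nc < w and not region[nr, nc] \
                    and abs(float(image[nr, nc]) - seed_value) <= tol:
                region[nr, nc] = True
                queue.append((nr, nc))
    return region

img = np.array([[10, 12, 80, 82],
                [11, 13, 81, 83],
                [10, 11, 79, 80]])
print(region_grow(img, seed=(0, 0)).astype(int))   # grows over the dark left half only
```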
A connectivity-preserving relaxation-based segmentation method, usually referred to as the active contour model, was proposed more recently. The main idea is to start with some initial boundary shape represented in the form of spline curves, and iteratively modify it by applying various shrink/expansion operations according to some energy function. Although the energy-minimizing model is not new, coupling it with the maintenance of an "elastic" contour model gives it an interesting new twist. As usual with such methods, getting trapped in a local minimum is a risk against which one must guard; this is not an easy task.
2.2.3 Geometric feature determination methods
The geometric feature determination method is used together with the color segmentation method in order to provide results with higher accuracy. This method is based on prior knowledge of the geometric features of landmarks, so to perform this type of analysis markers should be chosen with a distinctive shape (circle, square, triangle, etc.). In our investigation a circular marker shape was chosen, allowing the use of such characteristics as eccentricity, area and mass center.
It is very convenient to process the image in binary (or grayscale) mode: the image loses the information about the colored features of the environment, but the edges of all elements present in the image remain well distinguished. This binary image can easily be obtained after image segmentation and further processing, assuming that white corresponds to the current object color and black corresponds to the background.
In order to achieve higher accuracy, a simple method of noise filtering has to be applied. This method is based on the morphological analysis of objects, considering that the area of a valid object is larger than the boundary value for noise (this boundary value can be set in the program). The area S of an object is calculated by summing all of its white pixels. To calculate the eccentricity of an ellipse described by the equation x²/a² + y²/b² = 1, I used the expression e = √(1 − b²/a²); in particular, the eccentricity of a circle is equal to 0, and for an ellipse the eccentricity lies in the range between 0 and 1.
For each object there exists a label matrix with such object parameters as index, area, eccentricity, orientation (relative to the x axis of the image), mass center coordinates (in image x, y terms), and the color index of the object. The parameters mentioned above are checked for correspondence with the specified object parameters. In the case of a disparity, objects are considered to be erroneous and are filtered out. If during some period of time at least one parameter keeps its value stable, the object is considered suitable for use during visual positioning.
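A sketch of the label-matrix filtering described above (assuming the scikit-image library; the area and eccentricity thresholds are illustrative assumptions, not the values used in this work):

```python
import numpy as np
from skimage.measure import label, regionprops

def filter_markers(binary, min_area=50, max_eccentricity=0.6):
    """Keep only labelled blobs that are large enough and nearly circular."""
    labels = label(binary)                     # label matrix: one integer index per blob
    markers = []
    for region in regionprops(labels):
        if region.area >= min_area and region.eccentricity <= max_eccentricity:
            markers.append({"index": region.label,
                            "area": region.area,
                            "eccentricity": region.eccentricity,
                            "mass_center": region.centroid})
    return markers

# Synthetic binary image: one filled disc (the marker) plus a few isolated noise pixels.
yy, xx = np.mgrid[0:60, 0:60]
binary = ((xx - 30) ** 2 + (yy - 30) ** 2) <= 10 ** 2
binary[2, 2] = binary[5, 50] = True
print(filter_markers(binary))
```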
2.3 Analysis of localization methods
2.3.1 Extended Kalman Filter (EKF)
The Kalman filter is an efficient recursive filter that estimates the state of a dynamic system from a series of incomplete and noisy measurements. This means that only the estimated state from the previous time step and the current measurement are needed to compute the estimate for the current state. In contrast to batch estimation techniques, no history of observations and/or estimates is required.
While the EKF has become the dominant approach to SLAM, it suffers from two problems that complicate its application in large, real-world environments: quadratic complexity and sensitivity to failures in data association.
The first drawback of the EKF as a solution to the SLAM problem is computational complexity. Both the computation time and memory required by the EKF scale quadratically with the number of landmarks in the map [23, 4, 10], limiting its application to relatively small maps. Quadratic complexity is a consequence of the Gaussian representation employed by the EKF. The uncertainty of the SLAM posterior is represented as a covariance matrix encoding the correlations between all possible pairs of state variables. In a two-dimensional world, the covariance matrix contains (2N + 3) × (2N + 3) entries, where N is the total number of landmarks in the map. Thus, it is easy to see how the memory required to store this covariance matrix grows with N².
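The quadratic growth of the covariance can be made concrete with a small sketch (the landmark counts are arbitrary):

```python
import numpy as np

def ekf_covariance_entries(n_landmarks):
    """Size of the EKF-SLAM covariance: (2N + 3) x (2N + 3) for a planar robot with 2D landmarks."""
    dim = 2 * n_landmarks + 3        # robot pose (x, y, theta) plus two coordinates per landmark
    return dim * dim

for n in (10, 100, 1000, 10000):
    entries = ekf_covariance_entries(n)
    megabytes = entries * np.dtype(np.float64).itemsize / 1e6
    print(f"N = {n:6d}: {entries:12d} entries, about {megabytes:8.1f} MB as float64")
```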
The second problem with EKF-based SLAM approaches is related to data association, the mapping between observations and landmarks. In the real world, the associations between observations and landmarks are hidden variables that must be determined in order to estimate the robot pose and the landmark positions.
The standard approach to data association in EKFs is to assign every observation to a landmark using a maximum likelihood rule; i.e. every observation is assigned to the landmark most likely to have generated it. If the probability of an observation belonging to an existing landmark is too low, it is considered for inclusion as a new landmark. Since the EKF has no mechanism for representing uncertainty over data associations, the effect of incorporating an observation given the wrong data association can never be undone. If a large number of readings are incorporated incorrectly into the EKF, the filter will diverge.
The computational and memory requirements of the EKF make this approach infeasible for the SLAM problem.
2.3.2 Rao-Blackwellized Filter
Murphy, Doucet, and colleagues [27] introduced Rao-Blackwellized particle filters (RBPFs) as an effective means to solve the SLAM problem. The main problem of Rao-Blackwellized particle filters lies in their complexity, measured in terms of the number of particles required to learn an accurate map. Either reducing this quantity or improving the algorithm so that it is able to handle large sample sets is one of the major challenges for this family of algorithms.
The key idea of the Rao-Blackwellized particle filter for SLAM is to estimate the joint posterior p(x1:t, m | z1:t, u1:t-1) over the trajectory x1:t = x1, . . . , xt of the robot and the map m of the environment, given the observations z1:t = z1, . . . , zt and the odometry measurements u1:t-1 = u1, . . . , ut-1. It does so by using the following factorization:
p(x1:t, m | z1:t, u1:t-1) = p(m | x1:t, z1:t) · p(x1:t | z1:t, u1:t-1)    (2.3)
In this equation, the posterior p(x1:t | z1:t, u1:t-1) is similar to the localization problem, since only the trajectory of the vehicle needs to be estimated. This estimation is performed using a particle filter which incrementally processes the observations and the odometry readings as they become available. The second term p(m | x1:t, z1:t) can be computed efficiently since the poses x1:t of the robot are known when estimating the map m. As a result of the factorization, a Rao-Blackwellized particle filter for SLAM maintains an individual map for each sample and updates this map based on the trajectory estimate of the sample, following the idea of "mapping with known poses".
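A structural sketch of this factorization is given below; the motion model, map update and likelihood are toy placeholders (assumptions for illustration only), not the models of any particular SLAM system:

```python
import numpy as np

class Particle:
    """One hypothesis: the current robot pose plus a map conditioned on its trajectory."""
    def __init__(self, pose):
        self.pose = np.array(pose, dtype=float)   # current (x, y, theta)
        self.landmarks = {}                       # per-particle map, built "with known poses"
        self.weight = 1.0

def rbpf_step(particles, control, observation, sample_motion, update_map, likelihood):
    """One RBPF step: sample motion, weight by the observation, update maps, resample."""
    for p in particles:
        p.pose = sample_motion(p.pose, control)              # draw from p(x_t | x_{t-1}, u_t)
        p.weight *= likelihood(observation, p.pose, p.landmarks)
        update_map(p.landmarks, p.pose, observation)         # p(m | x_1:t, z_1:t) with known pose
    weights = np.array([p.weight for p in particles])
    weights /= weights.sum()
    idx = np.random.choice(len(particles), size=len(particles), p=weights)
    resampled = []
    for i in idx:
        q = Particle(particles[i].pose)
        q.landmarks = dict(particles[i].landmarks)           # copy the map of the chosen particle
        resampled.append(q)
    return resampled

# Toy stand-in models so the skeleton runs end to end (purely illustrative).
motion = lambda pose, u: pose + np.array([u[0], u[1], 0.0]) + np.random.normal(0, 0.01, 3)
update = lambda m, pose, z: m.setdefault("lm0", pose[:2] + z)
like = lambda z, pose, m: float(np.exp(-np.linalg.norm((pose[:2] + z) - m.get("lm0", pose[:2] + z)) ** 2))

particles = [Particle([0.0, 0.0, 0.0]) for _ in range(30)]
particles = rbpf_step(particles, control=(0.1, 0.0), observation=np.array([1.0, 0.5]),
                      sample_motion=motion, update_map=update, likelihood=like)
print("particles after one step:", len(particles))
```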
2.3.3 Conclusions
After thorough investigation of all possible solutions the web-camera A-4 Tech PK-635 was chosen as a vision sensor, because of its technical characteristics: sufficiently high resolution, capture speed and low cost.
In order to process the image, methods of image segmentation and geometric feature extraction were applied, as the combination of these methods gives a high accuracy of investigation.
After analysis of the EKF and the RBPF, I came to the conclusion that these filters are too complex for analyzing the surrounding environment and require a powerful computing platform. That is why I decided to use a range measurement algorithm in order to localize and position the robot in the surrounding environment.
3. Development of visual system of SLAM
3.1 Selection of technical means for the developed system
3.1.1 Vision sensor used in developed system
Web-camera description
Figure 3.1 - Web camera A4Tech PK-635, 350K pixels
Company name: A4 Technology
Model: A4Tech PK-635
Operating system: Windows 9x/ME/2000/XP/2003
Design: movable protective cover of the objective, flexible support
Viewing angle: 54°
Sensor: CMOS, 300K pixels
Capture rate: 30 fps
Exposure: automatic
Button: snapshot button
Resolution: 640×480
Distance to the object: from 10 cm
Interface: USB 1.1
Built-in microphone: present
3.1.2 On-board computer selection
The hardware and software requirements for the given project are:
1. CPU: 1.66 GHz
2. Memory: 128 MB
3. Video card: 64 MB
4. Monitor (17")
5. Web camera (on board the robot)
6. Operating system: Windows 2000/XP
7. An LPT port must be present and working on the computer.
8. A C language compiler for Matlab
3.2 Development of color segmentation algorithm
3.2.1 Algorithm of image color segmentation
The first step is to classify each pixel using the nearest neighbor rule. Each color marker has an 'a' and a 'b' value. Each pixel in the image is classified by calculating the Euclidean distance between that pixel and each color marker; the smallest distance indicates that the pixel most closely matches that color marker. For example, if the distance between a pixel and the second color marker were the smallest, then the pixel would be labeled with that marker's color.
After that, the results of the nearest neighbor classification can be displayed. For this, a label matrix is created that contains a color label for each pixel in the image. We use the label matrix to separate objects in the original image by color.
The next step is to check and analyze the reaction of our system to different colors using the values of the color markers.
Figure 3.2 - Block diagram of image color segmentation algorithm
Then we must calculate each object's perimeter and area and, using this information, compute the roundness metric. After that we set a threshold on the metric and compare the metric of all closed objects against it in order to exclude objects that are not related to the markers. It is necessary to remove areas whose form differs from a circular one.
The next step is to remove closed objects whose area is less than a set threshold; these are considered to be noise.
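A combined sketch of the nearest-neighbor color classification and the roundness check (assuming the scikit-image library; the reference marker values, the roundness metric 4πA/P² and the thresholds are illustrative assumptions):

```python
import numpy as np
from skimage.color import rgb2lab
from skimage.measure import label, regionprops

# Reference (a, b) values for the color markers -- illustrative numbers only.
MARKERS = {"red": (60.0, 50.0), "green": (-60.0, 55.0), "yellow": (-10.0, 75.0)}

def classify_pixels(rgb_image):
    """Assign each pixel to the nearest color marker in the (a, b) plane."""
    lab = rgb2lab(rgb_image)
    ab = lab[..., 1:3]                                         # keep only a and b
    names = list(MARKERS)
    refs = np.array([MARKERS[n] for n in names])               # (K, 2) reference points
    dists = np.linalg.norm(ab[..., None, :] - refs, axis=-1)   # (H, W, K) Euclidean distances
    return np.array(names)[np.argmin(dists, axis=-1)]          # nearest marker per pixel

def keep_round_blobs(binary_mask, min_metric=0.8):
    """Keep blobs whose roundness metric 4*pi*area / perimeter**2 is high enough."""
    keep = np.zeros_like(binary_mask, dtype=bool)
    labels = label(binary_mask)
    for region in regionprops(labels):
        metric = 4.0 * np.pi * region.area / max(region.perimeter, 1e-9) ** 2
        if metric >= min_metric:
            keep[labels == region.label] = True
    return keep

# Synthetic test image: a red disc on a black background.
yy, xx = np.mgrid[0:40, 0:40]
demo = np.zeros((40, 40, 3))
demo[(xx - 20) ** 2 + (yy - 20) ** 2 <= 8 ** 2] = [1.0, 0.0, 0.0]
red_mask = classify_pixels(demo) == "red"
print("red pixels:", int(red_mask.sum()), "kept as round markers:", int(keep_round_blobs(red_mask).sum()))
```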
3.2.2 Investigation of factors that influence the color index
As mentioned before, the L*a*b color model is based on three components of color: "L" (luminance or lightness), "a" (chromatic component from green to magenta), and "b" (chromatic component from blue to yellow). It was assumed that the luminance component determines the lightness of the current image and does not contain information about the color parameters. Based on this assumption the segmentation algorithm was developed, which produced the following results (segmentation was performed for the red, green and yellow colors by means of the "a" and "b" color components).
Figure 3.3 - Snapshot, made by web camera
Figure 3.4 - Results of segmentation for red, green and yellow colors, considering "a" and "b" components of color
It is visible from the obtained results that the segmentation procedure does not give the required results and accuracy.
The next step was to analyze histogram of color distribution in the current image.
Figure 3.5 - Histogram of color distribution in the image shown in figure 3.3
The histogram shows that the greatest number of pixels belongs to the background (white and grey colors), and that there are 5 colors above the threshold value (500 pixels): red, yellow, green, pink and blue. There are also 2 colors below the threshold value, which means that the number of pixels of these colors is very low.
During the investigation it was noticed that, for the same color marker, the lightness component changes with the change in background color. This means that lightness also has to be taken into account when determining the color index. The following table partially shows the results obtained during the investigation.
Table 3.1 - Values of color components for red, green and yellow colors at different backgrounds

Background | RED (L, a, b) | GREEN (L, a, b) | YELLOW (L, a, b)
White | 83, 167, 159 | 204, 86, 181 | 245, 110, 207
Black | 143, 177, 162 | 211, 83, 181 | 246, 112, 195
Red | 97, 165, 157 | 217, 72, 172 | 237, 97, 192
Orange | 99, 168, 155 | 190, 74, 172 | 229, 93, 185
Mixed | 104, 178, 167 | 222, 82, 167 | 226, 106, 195
That is why the next step was to add the "L" component and to perform segmentation having information not only about the red, green and yellow colors, but also about pink, blue and the two background colors (white and grey). The results were as follows:
Figure 3.6 - Results of segmentation for red, green, yellow, pink, blue, white and gray colors, considering "a", "b" and "L" components of color
From the results illustrated in figure 3.6 we can conclude that, for the best segmentation results, we have to consider the number of colors present in the image and the quantity of each color, i.e. how many pixels of each color are present in the current image. After 20 experiments similar to those represented in figures 3.3-3.6, the conclusion was reached that it is highly recommended to first define the colors which exceed the threshold value in the histogram of color distribution, i.e. the colors which are most present in the current image, and then, using this information, to perform image segmentation with the previously defined colors. This principle significantly improves the accuracy of segmentation and, as a result, decreases the error during positioning. As illustrated in figure 3.4, the accuracy of segmentation is rather poor when not all the colors above the threshold value are considered. Figure 3.6 shows the results of segmentation that takes into account all colors above the threshold value, and a comparison of the obtained results proves that the accuracy of segmentation for figure 3.6 is much higher than for figure 3.4.
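A sketch of the dominant-color selection described above (the 500-pixel threshold follows the text; the coarse quantization into bins and the synthetic image are assumptions):

```python
import numpy as np

def dominant_colors(rgb_image, levels=8, min_pixels=500):
    """Quantize the image coarsely and return the colors whose pixel count exceeds a threshold."""
    step = 256 // levels
    bins = (rgb_image.astype(int) // step).reshape(-1, 3)          # coarse color bins
    colors, counts = np.unique(bins, axis=0, return_counts=True)
    keep = counts >= min_pixels
    order = np.argsort(counts[keep])[::-1]                         # most frequent first
    centres = (colors[keep] * step + step // 2).astype(np.uint8)   # representative RGB values
    return centres[order], counts[keep][order]

# Synthetic image: white background, a large red patch, and a small patch below the threshold.
img = np.full((200, 200, 3), 255, dtype=np.uint8)
img[50:120, 50:120] = (220, 30, 30)
img[0:10, 0:10] = (10, 200, 10)
colors, counts = dominant_colors(img)
print(colors)
print(counts)
```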