DETAILED DESCRIPTION OF THE INVENTION
[0032] The functionality of the current invention is best described with initial reference to FIG. 1. A controlled vehicle 101 contains an imager and an image processing system that is capable of acquiring and analyzing images of the region generally forward of the controlled vehicle. The imager and image processing system are preferably contained in the controlled vehicle's rear view mirror assembly 102, thus providing a clear forward view 103 from a similar perspective as the driver through the windshield in the region cleaned by the windshield wipers. The imager may alternatively be placed in any suitable position in the vehicle and the processing system may be contained with the imager or positioned elsewhere. A host of alternate configurations are described herein, as well as within various incorporated references. The image analysis methods described herein may be implemented by a single processor, such as a microcontroller or DSP, by multiple distributed processors, or in a hardware ASIC or FPGA.
[0033] The imager acquires images such that the head lamps 104 of oncoming vehicle 105 and the tail lamps 106 of preceding vehicle 107 may be detected whenever they are within an area where the drivers of vehicles 105 or 107 would perceive glare from the head lamps of controlled vehicle 101. When head lamps or tail lamps are detected, the high beams of controlled vehicle 101 may be switched off or the beam pattern may be otherwise modified in such a way as to reduce glare to the occupants of other vehicles.
[0034] An imager 200 for use with the present invention is shown in FIG. 2 . A lens 201 containing two separate lens elements 202 and 203 forms two images of the associated scene onto an image sensor 204 . One image of the scene is filtered by a red filter 205 placed on the surface of the image sensor 204 and covering one half of the pixels. By comparing pixels in the filtered and non-filtered images corresponding to the same regions of space, the relative redness of light sources detected by those pixels can be determined. Other methods of color discrimination, such as the use of checkerboard red/clear filters, striped red/clear filters, or mosaic or striped red/green/blue filters may also be used. Detailed descriptions of optical systems for use with the present invention are contained in copending U.S. Pat. No. 6,130,421 entitled Imaging system for vehicle headlamp control and U.S. patent application Ser. No. 10/208,142 entitled Light Source Detection and Categorization system for Automatic Vehicle Exterior Light Control and Method of Manufacturing, commonly assigned with the present invention and incorporated herein in their entireties by reference.
[0035] Turning now to FIG. 3, a block diagram of an image sensor for use with the present invention is depicted. As shown, the imager comprises a pixel array 305, a voltage/current reference 310, digital-to-analog converters (DACs) 315, voltage regulators 320, low-voltage differential signal I/O 325, a digital block 330, row decoders 335, reset boost 340, temperature sensor 345, pipeline analog-to-digital converter (ADC) 350, gain stage 355, crystal oscillator interface 360, analog column 365 and column decoders 370. Preferably, these devices are integrated on a common circuit board or silicon substrate. However, any or all of the individually identified devices may be mounted to a separate structure. A preferred imager in accordance with that shown in FIG. 3 is described in detail in commonly assigned U.S. provisional patent application ______ Attorney Docket AUTO 318V1, entitled IMAGE ACQUISITION AND PROCESSING SYSTEM, the disclosure of which is incorporated in its entirety herein by reference.
[0036] In a preferred embodiment, the imager is a CMOS design configured to meet the requirements of automotive applications. Preferably, the imager provides 144 columns and 176 rows of photodiode based pixels. Preferably, the imager also has provisions for sensing temperature, controlling at least one output signal, providing voltage regulation to internal components, and incorporated device testing facilities. Imager commands preferably provide control of a variety of exposure, mode and analog settings. The imager is preferably capable of taking two image subwindows simultaneously from different starting rows; this feature permits highly synchronized images in a dual lens system as described with reference to FIG. 2. In a preferred embodiment, a single command instruction is sent to the imager and the imager then responds with two sequential images having unique exposure times. Another preferred option allows the analog gains to be applied in a checkerboard pattern for applications where a filter is applied to the imager in a checkerboard pattern. Preferably, data can be transmitted in ten bit mode, a compressed eight bit mode where a ten bit value is represented in eight bits (as described in more detail elsewhere herein), or a truncated eight bit mode where only the most significant eight bits of each ten bit pixel value are transmitted.
[0037] Turning to FIG. 4, it is preferred that control and data signals are communicated between an imager and an associated microprocessor via a low-voltage differential signaling serial peripheral interface (LVDS SPI) 405. As shown in FIG. 4, the LVDS SPI provides a communication interface between image sensor 410 and microcontroller 415. The preferred LVDS SPI comprises an LVDS transceiver 420, an incoming data logic block 425, a dual port memory 430, and a microcontroller interface logic block 435. It should be understood that a host of known LVDS devices are commercially available and it is envisioned that LVDS devices other than that shown in FIG. 4 may be utilized with the present invention. For example, the dual port memory may be omitted and the control and data signals transmitted directly over the LVDS link. A more detailed description of the LVDS SPI interface in accordance with that shown in FIG. 4 is contained in commonly assigned U.S. provisional patent application ______ Attorney Docket AUTO 318V1, the disclosure of which is incorporated in its entirety herein by reference.
[0038] In a preferred embodiment, the dual port memory is provided to enable the microcontroller to perform other functions while image data is being sent from the imager. The microcontroller then reads the image data from the dual port memory once free to do so. Preferably, the dual port memory allows sequential access to individual memory registers one-by-one. In a special alternate mode readout, two read pointers are provided to allow alternate access to two different regions of memory. This feature is particularly beneficial when used along with the dual integration time feature of the image sensors. The image sensor will send two images sequentially having different integration times. In the alternating readout mode, the first pointer is set to the start of the first image and the second pointer to the start of the second. Thus, for each pixel location the first integration time pixel value is read out first followed by the pixel value from the second integration.
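The alternating readout mode may be illustrated by the following minimal sketch, which assumes the dual port memory can be modeled as a flat list holding the two images back to back. The function name alternating_readout and its parameters are hypothetical and are not taken from the figures.

```python
def alternating_readout(memory, first_start, second_start, pixel_count):
    """Yield, for each pixel location, the first-integration value followed by
    the second-integration value, using two independent read pointers."""
    first_ptr, second_ptr = first_start, second_start
    for _ in range(pixel_count):
        yield memory[first_ptr], memory[second_ptr]
        first_ptr += 1
        second_ptr += 1


# Hypothetical memory contents: a 4 pixel short-integration image followed by
# the 4 pixel long-integration image of the same scene.
memory = [1, 2, 3, 4, 10, 20, 30, 40]
pairs = list(alternating_readout(memory, first_start=0, second_start=4, pixel_count=4))
```

Reading the two values for a pixel location together allows the microcontroller to choose between them without buffering a second full image.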
[0039] An image acquisition and analysis method of the present invention is described with reference first to FIG. 5 . The control proceeds as a sequence of acquisition and processing cycles 500 , repeated indefinitely whenever control is active. Cyclic operation may occur at a regular rate, for example once every 200 ms. Alternatively, the cyclic rate may be adjusted depending on the level of activity or the current state of the vehicle lamps. Cycles may be interrupted for other functions. For example, the processing system may also control an automatic dimming rear view mirror, a compass, a rain sensor, lighting, user interface buttons, microphones, displays, vehicle interfaces, telemetry functions, multiplexed bus communication, as well as other features. If one of these features requires processor attention, cycle 500 may be suspended, interrupted or postponed.
[0040] Cycle 500 begins with the acquisition of one or more images 501 that are, at least in part, stored to memory for processing. The corresponding images may be synthetic high dynamic range images as described further herein. Next, in step 502, various objects and properties of these objects are extracted from the acquired images. These objects usually are light sources that must be detected and classified. The term “light source” as used herein includes sources that emit light rays as well as objects that reflect light rays. In step 503 the motion of light sources and other historical behavior is determined by finding and identifying light sources from prior cycles and associating them with light sources in the current cycle. Light sources are classified in step 504 to determine if they are vehicular head lamps, vehicle tail lamps, or other types of light sources. Finally, in step 505, the state of the controlled vehicle lamps is modified, if necessary, based upon the output of step 504 and other vehicle inputs.
[0041] It should be understood that although the steps in FIG. 5 are shown as sequential, it is possible to alter the order of the steps or perform various steps in parallel. For example, as discussed in more detail below, the preferred object extraction algorithm requires only four or even as few as two rows of the image be stored in memory at any given time, thus facilitating at least partial object extraction in parallel with image acquisition. Also, an image acquisition method presented herein may synthesize a high-dynamic range (HDR) image through multiple exposures and then process the high-dynamic range image after synthesis. Alternatively, the images with each exposure setting may be processed independently from each other. Finally, each of the steps in FIG. 5 need not complete before the next step begins. For example, once a light source is detected in step 502, its historical information may be immediately determined in step 503 and it may be immediately classified in step 504. Then the next light source, if any, may be identified in step 502. It should also be understood that any of the steps of FIG. 5 may be beneficially applied to vehicle imaging systems independently of other steps, in various combinations with other steps or with prior art embodiments.
[0042] The wide range of light levels that must be detected by the imaging system presents a significant challenge. Bright head lamps are several thousand times more intense than distant tail lamps. Many of the techniques employed to distinguish lights from one another benefit from relatively accurate measures of brightness and color; therefore, saturation in the image due to brighter light sources may lead to misidentification. High dynamic range imagers have been developed that could be used beneficially; however, they remain fairly obscure and expensive. Details associated with creating a synthetic high dynamic range image are included in copending, commonly assigned, U.S. patent application Ser. No. ______entitled, Vehicle Vision System with High Dynamic Range, Attorney docket No. AUTO 218, the disclosure of which is incorporated herein in its entirety by reference. In at least one embodiment of the present invention, associated problems have been overcome through creation of a synthetic high dynamic range (HDR) image.
[0043] Referring to FIG. 6, the process for acquiring and synthesizing an HDR image includes the acquisition of two or more images at different exposures to cover different brightness ranges. While any number of images may be taken at different exposure intervals, three images will be used, as an example, with exposure times of 1, 6, and 36 ms. In a preferred embodiment, an HDR image is synthesized utilizing five images, each with a unique integration period, for example, with exposures of 0.25, 0.5, 2, 8 and 30 ms. As described herein, a preferred imager provides the ability to acquire two images with unique integration periods with a single command; it may be desirable to create an HDR image utilizing two images having unique integration periods, for example using integration times between 0.5 and 50 ms. It may be desirable, irrespective of the number of images utilized, to employ integration times ranging from 0.5 to 50 ms. It may be desirable to utilize any number of individual images, for example, a range of 1 to 10 images may be utilized. First, in step 601, the image memory is zeroed. Next, in step 602, the first image with the shortest exposure (1 ms) is acquired. Step 603 is irrelevant for the first image since the memory is all zeros.
[0044] Step 604 represents an optional step used to correct for fixed pattern imager noise. Most image sensors exhibit some type of fixed pattern noise due to manufacturing variances from pixel to pixel. Fixed pattern noise may be exhibited as a variance in an offset, a gain or slope or combination thereof. Correction of fixed pattern noise may improve overall performance by assuring that the sensed light level of an imaged light source is the same regardless of the pixel onto which it is imaged. Improvements in imager fabrication process may render this correction unnecessary.
[0045] If correction is warranted, correction in offset (step 604), slope (step 606) or both may be accomplished by the following method. To provide the correction, each sensor is measured during manufacturing and a pixel-by-pixel lookup table is generated that stores the offset and/or slope error for each pixel. In step 604, the offset is corrected by adding or subtracting the error value stored in the table for the current (i-th) pixel. Slope correction may also be applied at this point by multiplying the pixel value by the slope error factor. However, since the image is preferably converted to logarithmic normalized values, the slope correction may be applied by a less computationally expensive addition or subtraction to the logarithmic value in step 606. In a modification of this method, several different pixel response ranges are identified and a corresponding correction look-up-table is created for each, each of which is identified as a particular bin. During manufacturing each pixel of an imager is measured and the nearest correction look-up-table is identified. The pixel is then assigned a bin number that is stored in the processor's non-volatile memory. When images are taken during operation, the correction lookup table corresponding to the bin of the given pixel is applied and the imager uniformity is improved.
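The equivalence between multiplicative slope correction in the linear domain and additive slope correction in the logarithmic domain may be sketched as follows. The function names, the sign convention for the offset error, and the example values are assumptions made for illustration only.

```python
import math


def correct_in_linear_domain(raw, offset_error, slope_factor):
    """Offset correction (step 604) followed by slope correction by multiplication."""
    return (raw - offset_error) * slope_factor


def correct_in_log_domain(raw, offset_error, log2_slope_term):
    """Offset correction (step 604), logarithmic conversion (step 605), then
    slope correction as a simple addition to the log value (step 606)."""
    return math.log2(raw - offset_error) + log2_slope_term


# Hypothetical per-pixel table entries measured during manufacturing.
raw, offset_error, slope_factor = 200, 4, 1.05
linear_result = correct_in_linear_domain(raw, offset_error, slope_factor)
log_result = correct_in_log_domain(raw, offset_error, math.log2(slope_factor))
```

Under these assumptions the per-pixel table would store the logarithm of the slope factor, so that the run-time correction is a single addition.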
[0046] In step 605 , the pixel value (plus the optional offset correction from step 604 ) is converted for creation of the HDR image. This conversion first may include an optional step of linearization. Many pixel architectures may respond non-linearly to incident light levels. This non-linearity may be manifested as an S-shaped curve that begins responding slowly to increasing light levels, then more linearly, and then tapers off until saturation. Such a response may induce error when attempting brightness or color computations. Fortunately, the non-linearity is usually repeatable and usually consistent for a given imager design. This correction is most efficiently achieved through a lookup table that maps the non-linear pixel response to a linear value. If the non-linearity is a consistent function for all imagers of the same design, the lookup table may be hard-coded into the processor. Otherwise it may be measured and stored on a chip-by-chip basis, as is the case for fixed pattern noise correction. Sensors that exhibit a substantially linear response will not require linearity correction.
[0047] The value of each pixel output must also be scaled by the ratio between the maximum exposure and the current exposure. In the case of this example, the data from the 1 ms image must be multiplied by 36. Finally, to accommodate the wide dynamic range, it is beneficial to take the logarithm of this value and store it to memory. This allows for the pixel value to be maintained as an 8-bit number thus reducing the memory requirement. If sufficient memory is available, logarithmic compression may be omitted. While the natural log is commonly used, log base 2 may alternatively be used. Highly computationally efficient algorithms may be used to compute the log and anti-log in base 2 . The entire computation of step 605 , linearization, scaling, and taking the logarithm is preferably performed in a single look-up table. A lookup table with these factors pre-computed is created for each exposure setting and used to convert the value from step 604 to the value to be stored to memory. Alternatively, as described herein with reference to FIGS. 7, 8 , 9 a and 9 b , a 10-bit to 8-bit compression algorithm may be employed.
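A per-exposure lookup table combining the scaling and logarithm of step 605 might be precomputed as in the following sketch. It assumes a linear 10 bit sensor (so linearization is omitted), base 2 logarithms, and an arbitrary mapping of the largest possible scaled value to code 255; the name build_luts and these scaling choices are illustrative and not taken from the described embodiment.

```python
import math


def build_luts(exposures_ms, raw_bits=10):
    """Return {exposure: table}, where table[raw] is the 8 bit log code for a
    raw pixel value after scaling by (longest exposure / this exposure)."""
    raw_max = (1 << raw_bits) - 1
    longest, shortest = max(exposures_ms), min(exposures_ms)
    # Assumption: the largest possible scaled value maps to output code 255.
    full_scale = math.log2(raw_max * longest / shortest)
    luts = {}
    for exposure in exposures_ms:
        ratio = longest / exposure
        table = [0]  # a raw value of zero is stored as zero (no light sensed)
        for raw in range(1, raw_max + 1):
            table.append(round(255 * math.log2(raw * ratio) / full_scale))
        luts[exposure] = table
    return luts


luts = build_luts([1, 6, 36])
```

With the example exposures, a raw value of 1 in the 1 ms image, 6 in the 6 ms image, and 36 in the 36 ms image all represent the same light level and therefore receive the same code.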
[0048] Finally, if fixed pattern noise correction is used, the slope error correction may be applied in step 606 to the logarithmic value from step 605 . The final value is stored to memory in step 607 . This entire process proceeds for each pixel in the image as indicated by step 608 . Once the first image is stored, the next higher exposure image may be acquired. Processing for this and all subsequent images proceeds similarly except for step 603 . For the second and later images, values are only stored to memory if no value from a lesser sensitivity image was detected. If a value is currently in memory there is no need for the value, that is likely saturated or nearer saturation, from a higher sensitivity image. Essentially, the higher sensitivity images simply serve to “fill in the blanks” left by those pixels that did not sense any light in prior images. Finally, when the highest exposure (36 ms in this example) image is acquired, no scaling will be necessary.
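The "fill in the blanks" behavior of steps 601 through 608 may be sketched as follows. For brevity this sketch stores linearly scaled values (the logarithmic compression being optional when memory permits) and treats a stored zero as "no value detected"; the function name and the example pixel data are hypothetical.

```python
def synthesize_hdr(images_by_exposure):
    """images_by_exposure maps exposure time (ms) to a list of raw pixel values.
    Images are processed from shortest to longest exposure."""
    exposures = sorted(images_by_exposure)
    longest = exposures[-1]
    pixel_count = len(images_by_exposure[longest])
    hdr = [0] * pixel_count                      # step 601: zero the image memory
    for exposure in exposures:                   # step 602: next higher exposure
        scale = longest / exposure               # no scaling for the longest exposure
        for i, raw in enumerate(images_by_exposure[exposure]):
            if hdr[i] == 0:                      # step 603: keep any earlier value
                hdr[i] = raw * scale             # steps 605 and 607
    return hdr


# Hypothetical three pixel scene: a dim, a medium, and a bright light source.
images = {1: [0, 0, 500], 6: [0, 300, 1023], 36: [40, 1023, 1023]}
hdr = synthesize_hdr(images)
```

In this example the bright pixel takes its value from the 1 ms image, the medium pixel from the 6 ms image, and the dim pixel from the 36 ms image, so the saturated readings of 1023 are never stored.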
[0049] With reference to the above discussion, the skilled artisan may identify many variations to the above method that are within the spirit of the present invention. For example, the process may occur backwards, beginning with the highest sensitivity image. In this case, pixels that are saturated from the higher sensitivity images may be replaced by non-saturated pixels from lower sensitivity images. Multiple images may be taken at each sensitivity and averaged to reduce noise. Functions other than the log function may be used to compress the range of the image such as deriving a unity, normalized, factor. Bit depths other than 8-bits may be used to acquire and store the image such as 9-bits, 10-bits, 16-bits, 32-bits and 64-bits. Finally, methods other than varying the exposure time, such as varying gain or A/D conversion parameters, may be used to alter the sensitivity of the acquired images.
[0050] Finally, it is also possible to independently store individual images of different sensitivities rather than store a single synthetic high dynamic range image. This method is useful when sufficient memory is available to store more than one image, as may be the case when a memory buffer is provided as discussed with regards to the LVDS SPI interface of FIG. 4 , discussed in greater detail herein below. In this case, pixel value is chosen from the appropriate exposure image and appropriately scaled during the object detection of step 502 .
[0051] Dynamic range compression of image grayscale values may also occur in hardware, either as a feature provided on chip with the image sensor or through associated circuitry. This is especially beneficial when 10 bit or higher resolution A/D converters are provided, since many bus communication protocols, such as the SPI bus, typically transmit data in 8-bit words or multiples of 8 bits. Thus a 10-bit value would usually be transmitted as a 16-bit word and actually take twice the bandwidth and memory of an 8-bit value. For camera based control functions such as the head lamp control function, the requirements for reading resolution are usually more closely aligned with constant percent of reading than with constant percent of full scale. The percentage change of a linearly encoded variable is a constant percent of full scale for each incremental step in the reading whereas the percentage change in the linear value corresponding to its logarithmically encoded counterpart is a constant percent of the linear reading for each incremental step in its associated log encoded value. With linear encoding the incremental change for a small value which is close to zero is a very large percent of the reading or value and the incremental change for a large value which is close to full scale is a very small percent of the reading or value. In a camera analog to digital converter, the conversion is normally linear and must be converted or mapped to another form when such a conversion is needed.
[0052] Unless it is stated otherwise, it will generally be assumed that incremental accuracy refers to values already in or converted back to their linear range. For linearly encoded values which are close to zero, the incremental step is a large percentage of the reading and mapping these into readings where the incremental change in the associated linear value is smaller will result in single input values being mapped into multiple output values. An object of encoding values from a larger to a smaller set is to preserve necessary information with a smaller number of available bits or data points to encode the values. For example, in converting a 10 bit value to a compressed 8 bit value, the available number of data points drops by a factor of four from 1024 in the input set to 256 in the converted output set. To make effective use of the smaller number of available points, a given number of input codes in the larger input space should not in general map into a larger number of codes in the output space. If this is done, for example in the 10 bit to 8 bit conversion, it will not leave as many points in the 8 bit output space where lossy compression is required to map the larger number of 10 bit codes into the much smaller number of 8 bit codes. From this we can see that the conversion mapping needs to be planned so that for each range of the input values to be mapped, the desired information is preserved while being sparing in the use of output codes. For small values, the available information is normally needed and any encoding losses, including round off errors, may be objectionable, so a prudent approach is to map small values directly to the output space without conversion other than the possible addition or subtraction of a constant value. Logarithmic encoding is desirable for larger values to maintain an approximately equal percentage change of the associated linear input value for each incremental step in the output range.
The logarithm also has the desirable property that the effect of the application of a constant multiplier in the linear domain may be offset by the subtraction of the log of this multiplier in the log domain. Thus, as is normally done when using logarithms for calculation, a variant of scientific notation may be used applying a multiplier and expressing the number as a value in a specified range times an integral power of this range. For binary numbers, it is normally most convenient to choose a range of two to one, an octave, and to express the number as a normalized value which spans one octave times a power of two. Then for the log range, depending on the output codes available, the number of output values per octave may be chosen.
[0053] It should be understood that many monotonic linearization algorithms may be used in addition to a logarithmic linearization for data compression. Additionally, non-decreasing algorithms may be employed for data compression.
[0054] A convenient definition of resolution expressed as a percent or fraction of linear reading is needed for the discussion. This may be defined for a given output value as the ratio of the difference of the linear equivalent of the next value in the output sequence of values minus the linear equivalent of the given output value to the linear equivalent of the given output value. Let the i-th output value in the decoder output sequence be expressed as O(i) and let the linear equivalent of this value be expressed as LInv(O(i)). Let the defined linear reading based resolution be denoted by RIrb(O(i)). Then
RIrb(O(i)) = 100*(LInv(O(i+1)) − LInv(O(i)))/LInv(O(i))   (1)
[0055] For a logarithmic encoding with n values per octave, RIrb is constant (neglecting conversion round off errors) for the logarithmically encoded values and is
RIrb(O) = 100*(exp(log(2)/n) − 1)   (2)
[0056] where exp(x) is the natural number e raised to the power x and log(x) is the natural log of x.
[0057] For a linear one to one output encoding
O(i) = i   (3)
[0058] and
RIrb(i) = 100/i   (4)
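Equations (2) and (4) may be evaluated numerically, for example as in the following sketch. The function names are illustrative only.

```python
import math


def resolution_log(values_per_octave):
    """Equation (2): percent of reading per step for logarithmic encoding."""
    return 100 * (math.exp(math.log(2) / values_per_octave) - 1)


def resolution_linear(i):
    """Equation (4): percent of reading per step for one to one linear encoding."""
    return 100 / i


log_resolution_48 = resolution_log(48)        # 48 output values per octave
linear_resolution_63 = resolution_linear(63)  # top of a 0-63 linear range
```

The two results differ by less than 0.15 percentage points, which is the basis for choosing a transition point between linear and logarithmic mapping near an input value of 63 when 48 values per octave are used.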
[0059] As an example, for encoding a ten bit input as an 8 bit compressed output, map the first 64 input values, 0-63, directly to 0-63 of the output and then logarithmically map each of the four octaves, 64-127, 128-255, 256-511, and 512-1023, respectively, to 48 count output ranges, 64-111, 112-159, 160-207, and 208-255. Then from equation (2), RIrb is approximately equal to 1.45% per increment for values in the logarithmic conversion range which maps input range 64-1023 to output range 64-255. For the top end, 63, of the linear range, from equations (3) and (4), RIrb(63) is approximately equal to 1.59% per increment, which is close to 1.45% per increment for the logarithmic encoding, making it a good place for the transition from linear one to one mapping to logarithmic mapping. In fact, in the preferred implementation for which the input to output mapping is depicted by the curve in FIG. 7, the log conversion for the octave from 64 through 127 maintains the one to one mapping of input to output through value 77. By appropriately shifting the input data, the same one octave linear to log conversion may be used for each of the four octaves. With this encoding, if one value is greater than another in the output range, the same relation held for the corresponding pair of values in the input range.
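The example mapping may be modeled behaviorally as in the following sketch. This is a floating point approximation that truncates the logarithm; it is not the threshold table of the preferred circuit, so individual codes (for example, the exact input value at which the one to one mapping ends) may differ slightly from the curve of FIG. 7. The function name is hypothetical.

```python
import math


def compress_10_to_8(value):
    """Map a 10 bit value to 8 bits: 0-63 one to one, then 48 log steps per octave."""
    if not 0 <= value <= 1023:
        raise ValueError("expected a 10 bit input value")
    if value < 64:
        return value                              # linear range, passed unaltered
    octave = value.bit_length() - 7               # 64-127 -> 0, ..., 512-1023 -> 3
    base = 64 << octave                           # lowest input value of the octave
    step = int(48 * math.log2(value / base))      # 0..47 within the octave
    return 64 + 48 * octave + step


codes = [compress_10_to_8(v) for v in range(1024)]
```

The resulting list is non-decreasing and uses every output range boundary given in the example: 64-111, 112-159, 160-207, and 208-255.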
[0060] Cameras which incorporate stepwise linear compression are known to the inventor as are cameras with sensing arrangements which have a nonlinear and perhaps logarithmic light sensing characteristic to achieve an extended range. Cameras which combine ranges so that part of the output range is linear and part is logarithmic are not known. No cameras for the headlamp dimmer application which incorporate any form of compression in the camera module are known to the inventor.
[0061] A preferred embodiment of the invention is detailed in block diagram form in FIGS. 9a and 9b. The implementation described is a combinatorial circuit but sequential or asynchronous implementations are within the scope of the invention. Ten bit digital input signal in10[9:0] (901) is input to the circuit and the combinatorial output is eight bit signal out8[7:0] (902).
[0062] In block 903, a one-high range indication signal bd[4:0] is generated with one of the 5 lines of bd[4:0] high and the others zero for each of the input ranges as indicated. The input value ranges for in10[9:0] are shown in the first column in decimal as numbers without underscore separators or a 0x prefix. The output numbers prefixed by 0x are in hexadecimal format. Binary numbers in block 908 are indicated by an underscore separating each group of four binary 0 and 1 digits. These conventions will be used for each of the blocks in FIGS. 9a and 9b. A range designation from 0 to 4 is shown in the middle column of block 903 and is for convenience since the range is referenced so often in the logic and in this description. Input values which are in range 0 (input values from 0 through 63) are passed directly to output out8[7:0] without alteration. Each of the other four ranges spans one octave. (In these discussions, the octave is taken to include the lowest number and the number two times this number is included with the next octave so that each of the octave related input values is, by this definition, included in exactly one octave.) As will be detailed in the description of associated blocks, when an input value is in any of the four one octave ranges 1 through 4, the value is scaled and/or offset according to which range it is in and mapped into a 48 output value range using a common decoder block in the logic. The one octave 48 step logarithmically related output value is then scaled and/or offset according to the range that the input value is in and directed to the output.
[0063] In block 906, the input value is scaled and/or offset according to the range that it is in as indicated by the value of bd[4:0] and output as signal in9s[8:0] to the first block 908 of the logarithmic decoder. The logarithmic conversions are used for ranges 1 through 4 and due to the range classification criteria, the next higher bit, which would be in10[6] to in10[9] for ranges 1 through 4, respectively, is always 1. Since this bit is always one and adds no variable information, it is omitted from the comparison and is also excluded as a leading tenth bit in the inverse log columns 3 and 6 of block 908. For an input value in range 4, all nine of the variable bits are included in the comparison for the logarithmic conversion. For an input in range 3, the value is shifted left 1 as indicated by the multiply by 2 and a 1 is placed in the lsb, bit in9s[0]. The 1 in bit zero by subjective comparison yielded the smoothest conversion result. For an input in range 2, the value is shifted left 2 places and binary 10 is placed in the two least significant bits to provide a smooth conversion result. For an input in range 1, the value is shifted left 3 places and binary 010 is placed in the three least significant bits to provide a smooth conversion.
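The range classification of block 903 and the input scaling of block 906 may be modeled in software as in the following sketch. The function names and the use of an integer range index in place of the one-high signal bd[4:0] are assumptions made for illustration.

```python
# Fill bits placed in the vacated least significant positions, per range.
FILL_BITS = {4: 0b0, 3: 0b1, 2: 0b10, 1: 0b010}


def classify_range(in10):
    """Return the range designation 0..4 for a 10 bit input value."""
    if not 0 <= in10 <= 1023:
        raise ValueError("expected a 10 bit input value")
    if in10 < 64:
        return 0
    return in10.bit_length() - 6        # 64-127 -> 1, ..., 512-1023 -> 4


def scale_input(in10):
    """Return the 9 bit decoder input in9s for an input in ranges 1 through 4."""
    rng = classify_range(in10)
    if rng == 0:
        raise ValueError("range 0 values bypass the logarithmic decoder")
    leading_bit = 5 + rng               # in10[6] for range 1 ... in10[9] for range 4
    variable_bits = in10 & ((1 << leading_bit) - 1)   # drop the always-one bit
    shift = 4 - rng                     # left shift of 3, 2, 1, or 0 places
    return (variable_bits << shift) | FILL_BITS[rng]


example_in9s = [scale_input(v) for v in (64, 127, 128, 256, 511, 512, 1023)]
```

After scaling, inputs from all four octaves occupy the same 9 bit span, which is what allows a single decoder block to serve every range.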
[0064] Blocks 908, 909, and 910 are used to perform the 10 bit binary to 48 step per octave logarithmic conversion with 0 to 47 as the output log[5:0]. Block 908 is a group of 48 compare functions used in the ensuing blocks in the conversion. The ge[x, in9s[8:0]] terms are true if and only if the 9 bit input in9s[8:0] is a value whose output log[5:0] is greater than or equal to x. These functions are useful because to test that an output log[5:0] for an input in9s[8:0] is in a range which is greater than or equal to a but less than b the following expression may be used:
ge[a, in9s[8:0]] and not ge[b, in9s[8:0]]
[0065] Many such ranges must be decoded to provide logic expressions for each of the 6 bits in the 48 value output range. For convenience, in some of the Figs. and description, ge[x] will be used to mean the same thing as ge[x, in9s[8:0]].
[0066] Term ge[0, in9s[8:0]] is always true so does not appear explicitly in the ensuing terms. The value x in columns 1 and 4 is the index for the x-th value of the octave; the zeroth value, x=0, is the start of the octave and the 47th value, x=47, is the last value before the start of the next octave. ge[x, in9s[8:0]] represents the combinatorial logic function whose value is 1 if and only if in9s[8:0] is greater than or equal to the associated inverse log(x) value shown in the third or sixth column of block 908. As indicated before, the msb which is 1 is not shown. The inverse log values may be generated by the equation
exp(((x/48)+9)*log(2))
[0067] where exp(y) is the exponential function with the natural number e raised to the y-th power and log(z) is the natural log of z. The value of the above ranges from 512 through the value which is one step before 1024 for which x would equal 48. Values for this function yield the desired octave (between successive octaves the value for x equal 48 is included as the value for x=0 in the next octave). The most significant 1 bit is omitted in columns 3 and 6 of block 908.
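The inverse log thresholds and the ge[x] compare functions may be modeled as in the following sketch. Rounding each threshold to the nearest integer is an assumption; the actual values tabulated in block 908 may have been adjusted differently. The names THRESHOLDS, ge, and log_step are illustrative.

```python
import math


def inverse_log(x):
    """exp(((x/48)+9)*log(2)): start of the x-th of 48 steps in the octave 512-1023."""
    return math.exp(((x / 48) + 9) * math.log(2))


# 9 bit thresholds with the always-one most significant bit (weight 512) omitted.
THRESHOLDS = [round(inverse_log(x)) - 512 for x in range(48)]


def ge(x, in9s):
    """True if and only if the log step for in9s is greater than or equal to x."""
    return in9s >= THRESHOLDS[x]


def log_step(in9s):
    """Return log[5:0], 0..47, by counting the ge[x] terms that are true for x >= 1."""
    return sum(ge(x, in9s) for x in range(1, 48))


def in_step_range(a, b, in9s):
    """The range test from the text: ge[a] and not ge[b]."""
    return ge(a, in9s) and not ge(b, in9s)
```

Because the thresholds are strictly increasing, the count of true ge[x] terms equals the index of the highest threshold that the input meets.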
[0068] Because 47 ge[x, in9s[8:0]] terms are used and logic circuits must be provided for each of them, it is advantageous to create common intermediate terms which may be shared among the many greater-than-or-equal logic terms which are needed. Decoding circuits to indicate that specified ranges of consecutive bits in in9s[8:0] are all one are useful, as are decoding circuits to indicate that specified ranges of consecutive bits are greater than or equal to one (not all zero). Such terms have been used extensively in the code to enable sharing of logic terms for the 47 decoder expressions which are implemented.
[0069] In block 909, an optional Gray code encoding stage is used. Alternatively, the encoding could be done directly in binary, but this would require a few more logic terms. The encoding for each of the six bits glog[0] through glog[5] of an intermediate Gray code is performed with each of the glog bits being expressed as a function of ge[x] terms. The Gray code was chosen because only one of the six bits in glog[5:0] changes for each successive step in the glog output value. This generates a minimal number of groups of consecutive ones to decode for consecutive output codes for each of the output bits glog[0] through glog[5]. Thus, a minimal number of ge[x] terms are required in the logic expressions in column 2 of block 909.
[0070] In block 910 , the gray code glog[5:0] input is converted to a binary log[5:0] output.
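The single-bit-change property and the Gray-to-binary conversion can be illustrated with a short sketch. It assumes the standard reflected binary Gray code; the particular Gray code used in blocks 909 and 910 is not specified here, and the function names are hypothetical.

```python
def binary_to_gray(n: int) -> int:
    """Standard reflected binary Gray code of n (assumed variant)."""
    return n ^ (n >> 1)

def gray_to_binary(g: int) -> int:
    """Invert the reflected Gray code by XOR-ing successively shifted copies."""
    b = 0
    while g:
        b ^= g
        g >>= 1
    return b

def bits_changed(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")
```

With this code, each of the 48 log steps differs from its successor in exactly one Gray-coded bit, which is the property cited above as the reason for the intermediate encoding.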
[0071] In block 907, the number to add to log[5:0] to generate the appropriate log based output value for inputs in ranges 1 through 4 is generated. The hexadecimal range of the in10[9:0] value is listed in the first column, and the number to add to bits 4 through 7 of olog[7:0] is indicated in hexadecimal format in the second column. The third column indicates the actual offset added for each of the ranges when the bit positions to which the value is added are accounted for.
[0072] In block 905, the offset value va[3:0] is added to log[5:0], with bits 0 and 1 of va added to bits 4 and 5, respectively, of log[5:0], and appropriate carries are generated into bits 5, 6, and 7 to generate the 8 bit log based output olog[7:0].
[0073] In block 904, the direct linear encoding in10[5:0], zero padded in bits 6 and 7, is selected for inputs in range 0, and the logarithmically encoded value olog[7:0] is selected for the other ranges 1 through 4 to generate the 8 bit output out8[7:0].
[0074] FIG. 7 depicts the output 700a as a function of the input 700 of a data compression circuit such as the one detailed in the block diagram of FIG. 8. The input ranges extend in a first range from 0 to (not including) 701 and similarly in four one octave ranges from 701 to 702, from 702 to 703, from 703 to 704, and finally from 704 to 705. The first range maps directly into the range from 0 to (not including) 701a, and the four one octave ranges map respectively into 48 output value ranges from 701a to 702a, from 702a to 703a, from 703a to 704a, and finally from 704a to 705a. In a preferred implementation, the output for each of the four one octave output ranges is processed by a common input to log converter by first determining which range, and thus which octave, if any, the input is in, then scaling the input to fit into the top octave from 704 to 705, and then converting the input value to a 48 count (0 to 47) log based output. The offset at 701a, 702a, 703a, or 704a is then selectively added if the input is in the first, second, third, or fourth octave, respectively. Finally, if the value is in range 0, the direct linear output is selected; otherwise, the log based value calculated as just described is selected to create the output mapping depicted by curve 710.
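The overall mapping may be modeled in software as sketched below. The numeric boundaries are assumptions inferred from the 10 bit input, the 6 bit linear range, and the 48 steps per octave: inputs 0 to 63 pass through unchanged, the octaves start at 64, 128, 256, and 512, and the output offsets are 64, 112, 160, and 208. The sketch uses a floating point logarithm in place of the compare and decode logic, so individual codes could differ from a table driven hardware implementation at step boundaries.

```python
import math

def compress_10_to_8(value: int) -> int:
    """Map a 10 bit linear value to an 8 bit linear/log code (illustrative only)."""
    if not 0 <= value <= 1023:
        raise ValueError("expected a 10 bit input")
    if value < 64:
        return value                      # range 0: direct linear encoding
    octave = value.bit_length() - 7       # 64..127 -> 0, ..., 512..1023 -> 3
    scaled = value << (3 - octave)        # scale into the top octave 512..1023
    step = min(47, int(48 * math.log2(scaled / 512)))  # 48 steps per octave
    return 64 + 48 * octave + step        # add the offset for this octave
```

Under these assumptions the largest input, 1023, maps to 255, so the four octaves exactly fill the remainder of the 8 bit output range.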
[0075] FIG. 8 is a procedural form of the conversion detailed in the block diagram of FIGS. 9a and 9b. In block 801, the range that the input is in is determined. In block 802, the value is pre-scaled and/or translated to condition the value from the range that the input is in to use the common conversion algorithm. In block 803, the conversion algorithm is applied in one, two, or possibly more than two stages. In block 804, the compressed value is scaled and/or translated so that the output value is appropriate for the range that the input is in. In block 806, the compression algorithm of blocks 801 through 804 is used if the range that the input is in is appropriate to the data and the value is output in block 807. Otherwise, an alternate conversion appropriate to the special range is output in block 806.
Extraction of the light sources (also referred to as objects) from the image generated in step 501 is performed in step 502. The goal of the extraction operation is to identify the presence and location of light sources within the image and determine various properties of the light sources that can be used to characterize the objects as head lamps of oncoming vehicles, tail lamps of leading vehicles, or other light sources. Prior-art methods for object extraction utilized a “seed-fill” algorithm that identified groups of connected lit pixels. While this method is largely successful for identifying many light sources, it occasionally fails to distinguish between multiple light sources in close proximity in the image that blur together into a single object. The present invention overcomes this limitation by providing a peak-detect algorithm that identifies the location of peak brightness of the light source. Thereby, two light sources that may substantially blur together but still have distinct peaks may be distinguished from one another.
[0076] A detailed description of this peak detection algorithm follows with reference to FIG. 10 . The steps shown proceed in a loop fashion scanning through the image. Each step is usually performed for each lit pixel. The first test 1001 simply determines if the currently examined pixel is greater than each of its neighbors. If not, the pixel is not a peak and processing proceeds to examine the next pixel 1008 . Either orthogonal neighbors alone or diagonal and orthogonal neighbors are tested. Also, it is useful to use a greater-than-or-equal operation in one direction and a greater-than operation in the other. This way, if two neighboring pixels of equal value form the peak, only one of them will be identified as the peak pixel.
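The tie-breaking comparison can be sketched as follows for the orthogonal-neighbor case. The function name, the row-major list-of-lists image, and the choice of which directions use the strict comparison are assumptions made for illustration; the pixel is assumed not to lie on the image border.

```python
def is_peak(img, r, c):
    """Return True if img[r][c] is a local peak among its orthogonal neighbors.

    A strict comparison is used toward the upper and left neighbors and a
    greater-than-or-equal comparison toward the lower and right neighbors,
    so a plateau of two equal pixels yields exactly one peak.
    """
    v = img[r][c]
    return (v > img[r - 1][c] and v > img[r][c - 1]
            and v >= img[r + 1][c] and v >= img[r][c + 1])
```

With two equal adjacent pixels in a row, only the left one passes the test, since the right one fails the strict comparison against its left neighbor.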
[0077] If a pixel is greater than its neighbors, the sharpness of the peak is determined in step 1002. Only peaks with a gradient greater than a threshold are selected, to prevent identification of reflections off of large objects such as the road and snow banks. The inventors have observed that light sources of interest tend to have very distinct peaks, provided the image is not saturated at the peak (saturated objects are handled in a different fashion discussed in more detail below). Many numerical methods exist for computing the gradient of a discrete sample set such as an image and are considered to be within the scope of the present invention. A very simple method benefits from the logarithmic image representation generated in step 501. In this method, the slope between the current pixel and the four neighbors in orthogonal directions two pixels away is computed by subtracting the log value of the current pixel under consideration from the log value of the neighbors. These four slopes are then averaged and this average is used as the gradient value. Slopes from more neighbors, or neighbors at different distances away, may also be used. With higher resolution images, use of neighbors at a greater distance may be advantageous. Once the gradient is computed, it is compared to a threshold in step 1003. Only pixels with a gradient larger than the threshold are considered peaks. Alternatively, the centroid of a light source and/or the brightness may be computed using a paraboloid curve fitting technique.
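A minimal sketch of the simple gradient measure follows. The function name and image layout are hypothetical. The sign convention is also an assumption: the slope is taken as the peak value minus the neighbor value, so that a sharper peak gives a larger positive gradient for comparison against the threshold.

```python
def peak_gradient(log_img, r, c, distance=2):
    """Average drop in log value from the pixel at (r, c) to the four
    orthogonal neighbors `distance` pixels away."""
    v = log_img[r][c]
    neighbors = (
        log_img[r - distance][c],
        log_img[r + distance][c],
        log_img[r][c - distance],
        log_img[r][c + distance],
    )
    return sum(v - n for n in neighbors) / 4.0
```

Because the image is logarithmic, each difference corresponds to a brightness ratio, so the same threshold applies to dim and bright sources alike.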
[0078] Once a peak has been identified, the peak value is stored to a light list (step 1004 ). While the peak value alone may be used as an indicator of the light source brightness, it is preferred to use the sum of the pixel values in the local neighborhood of the peak pixel. This is beneficial because the actual peak of the light source may be imaged between two or more pixels, spreading the energy over these pixels, potentially resulting in significant error if only the peak is used. Therefore, the sum of the peak pixel plus the orthogonal and diagonal nearest neighbors is preferably computed. If logarithmic image representation is used, the pixel values must first be converted to a linear value before summing, preferably by using a lookup table to convert the logarithmic value to a linear value with a higher bit depth. Preferably this sum is then stored to a light list in step 1005 and used as the brightness of the light source.
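The conversion to linear values before summing can be sketched with a lookup table. The table below assumes an 8 bit code in which values below 64 are linear and each subsequent group of 48 codes spans one octave starting at 64; this encoding, the function names, and the image layout are assumptions of the sketch rather than details given in this paragraph.

```python
def build_log_to_linear_lut():
    """Build a 256 entry table from 8 bit log codes to linear values (assumed encoding)."""
    lut = []
    for code in range(256):
        if code < 64:
            lut.append(float(code))                 # linear region
        else:
            octave, step = divmod(code - 64, 48)    # 48 codes per octave
            lut.append(64.0 * 2.0 ** (octave + step / 48))
    return lut

def neighborhood_brightness(log_img, r, c, lut):
    """Sum of linearized values over the 3x3 neighborhood centered on (r, c)."""
    return sum(lut[log_img[r + dr][c + dc]]
               for dr in (-1, 0, 1) for dc in (-1, 0, 1))
```

The table holds floating point values here; a fixed point table with a higher bit depth than the log code would serve the same purpose in an embedded implementation.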
[0079] Computation and storage of the centroid of the light source occurs in step 1006 . The simplest method simply uses the coordinates of the peak as the centroid. A more accurate fractional centroid location may be computed by the following formula:
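$$X = \frac{\sum_{i=x-1}^{x+1}\sum_{j=y-1}^{y+1} i \cdot val(i,j)}{\sum_{i=x-1}^{x+1}\sum_{j=y-1}^{y+1} val(i,j)}, \qquad Y = \frac{\sum_{i=x-1}^{x+1}\sum_{j=y-1}^{y+1} j \cdot val(i,j)}{\sum_{i=x-1}^{x+1}\sum_{j=y-1}^{y+1} val(i,j)}$$
[0080] where x is the x-coordinate of the peak pixel, y is the y-coordinate of the peak pixel, val(i, j) is the value of the pixel at coordinates (i, j), and X and Y are the coordinates of the resulting centroid. Of course, neighborhoods other than the 3×3 local neighborhood surrounding the peak pixel may be used with the appropriate modification to the formula.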
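A fractional centroid over the 3×3 neighborhood can be sketched as an intensity-weighted mean of the pixel coordinates. The function name, the row-major image layout (indexed as img[y][x]), and the use of linear pixel values are assumptions made for illustration.

```python
def fractional_centroid(img, x, y):
    """Intensity-weighted centroid of the 3x3 neighborhood around peak (x, y)."""
    total = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for j in range(y - 1, y + 2):
        for i in range(x - 1, x + 2):
            v = img[j][i]
            total += v
            sum_x += i * v
            sum_y += j * v
    if total == 0:
        return float(x), float(y)   # fall back to the peak coordinates
    return sum_x / total, sum_y / total
```

When the light energy is split evenly between the peak pixel and its right neighbor, the computed X falls halfway between the two columns while Y stays on the peak row.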
[0081] Finally, the color of the light source is determined in step 1007 . For the above discussion, it is assumed that an imaging system similar to that of FIGS. 2 and 3 is used and the red filtered image is used to locate the centroid and perform all prior steps in FIG. 10 . The red-to-white color ratio may be computed by computing the corresponding 3×3 neighborhood sum in the clear image and then dividing the red image brightness value by this number. Alternatively, only the pixel peak value in the red image may be divided by the corresponding peak pixel value in the clear image. In another alternative, each pixel in the 3×3 neighborhood may have an associated scale factor by which it is multiplied prior to summing. For example, the center pixel may have a higher scale factor than the neighboring pixels and the orthogonal neighbors may have a higher scale factor than the diagonal neighbors. The same scale factors may be applied to the corresponding 3×3 neighborhood in the clear image.
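The neighborhood-based color ratio, with optional per-pixel scale factors, can be sketched as below. The function names are hypothetical, and the sketch assumes that the red and clear images are already registered so that the same coordinates address corresponding regions of the scene, and that pixel values are linear.

```python
def weighted_sum_3x3(img, r, c, weights=None):
    """Sum of the 3x3 neighborhood around (r, c), optionally scaled per pixel."""
    if weights is None:
        weights = [[1.0] * 3 for _ in range(3)]
    return sum(weights[dr + 1][dc + 1] * img[r + dr][c + dc]
               for dr in (-1, 0, 1) for dc in (-1, 0, 1))

def red_to_white_ratio(red_img, clear_img, r, c, weights=None):
    """Red neighborhood brightness divided by the clear neighborhood brightness."""
    clear = weighted_sum_3x3(clear_img, r, c, weights)
    if clear == 0:
        return 0.0   # no light in the clear image; ratio is undefined
    return weighted_sum_3x3(red_img, r, c, weights) / clear
```

A weight matrix such as `[[1, 2, 1], [2, 4, 2], [1, 2, 1]]` would give the center pixel the highest scale factor and the diagonal neighbors the lowest, as in the weighted alternative described above.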
[0082] Misalignment in the placement of lens 201 over image array 204 may be measured during production test of devices and stored as a calibration factor for each system. This misalignment may be factored when computing the color ratio. This misalignment may be corrected by having different weighting factors for each pixel in the 3×3 neighborhood of the clear image as compared to that of the red image. For example, if there is a small amount of misalignment such that the peak in the clear image is ½ pixel left of the peak in the red image, the left neighboring pixel in the clear image may have an increased scale factor and the right neighboring pixel may have a reduced scale factor. As before, neighborhoods of sizes other than 3×3 may also be used.
[0083] For optical systems employing alternative color filter methods, such as a system using a mosaic filter pattern or striped filter pattern, color may be computed using conventional color interpolation techniques known in the art, and “redness” or full color information may be utilized. Color processing may be performed on the entire image immediately following acquisition or may be performed only for those groups of pixels determined to be light sources. For example, consider an imaging system having a red/clear checkerboard filter pattern. The process depicted in FIG. 10 may be performed by considering only the red filtered pixels and skipping all the clear pixels. When a peak is detected, the color in step 1007 is determined by dividing the peak pixel value (that is, a red filtered pixel) by the average of its four neighboring clear pixels. More pixels may also be considered; for example, four-fifths of the sum of the peak pixel plus its four diagonal neighbors (also red filtered) may be divided by the sum of the four clear orthogonal neighbors.
[0084] Several other useful features may be extracted in step 502 and used to further aid the classification of the light source in step 504 . The height of the light source may be computed by examining pixels in increasing positive and negative vertical directions from the peak until the pixel value falls below a threshold that may be a multiple of the peak, ½ of the peak value for example. The width of an object may be determined similarly. A “seed-fill” algorithm may also be implemented to determine the total extents and number of pixels in the object.
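The height measurement can be sketched as a scan up and down the peak column. The function name and image layout are hypothetical, and the sketch assumes linear pixel values so that a fraction of the peak value is meaningful as a threshold.

```python
def light_height(img, r, c, fraction=0.5):
    """Count the vertical run of pixels around the peak at (r, c) whose value
    stays at or above `fraction` of the peak value."""
    threshold = img[r][c] * fraction
    top = r
    while top - 1 >= 0 and img[top - 1][c] >= threshold:
        top -= 1
    bottom = r
    while bottom + 1 < len(img) and img[bottom + 1][c] >= threshold:
        bottom += 1
    return bottom - top + 1
```

The width may be obtained in the same way by scanning along the peak row instead of the peak column.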
[0085] The above described algorithm has many advantages including being fairly computationally efficient. In the case where only immediate neighbors and two row or column distant neighbors are examined, only four rows plus one pixel of the image are required. Therefore, analysis may be performed as the image is being acquired or, if sufficient dynamic range is present from a single image, only enough image memory for this limited amount of data is needed. Other algorithms for locating peaks of light sources in the image may also be utilized. For example, the seed fill algorithm used in the prior art may be modified to only include pixels that are within a certain brightness range of the peak, thus allowing discrimination of nearby light sources with at least a reasonable valley between them. A neural-network peak detection method is also discussed in more detail herein.
[0086] One potential limitation of the peak detection scheme discussed above occurs when bright light sources saturate the image, even when an HDR image is used, or when other very bright objects appear. In this case, the objects may be so bright or large that no isolated peak is detected and therefore the object would be ignored. This limitation may be overcome in a few ways. First, any single pixel that is either saturated or exceeds a maximum brightness threshold may be identified as a light source, regardless of whether it is a peak or not. In fact, for very bright lights, the entire process of FIG. 5 may be aborted and high beam headlights may be switched off. In another alternative, the sum of a given number of pixels neighboring the currently examined pixel is computed. If this sum exceeds a high-brightness threshold, it is immediately identified as a light source, or control is aborted and the high beam headlights are dimmed. Normally, two conditions are used to qualify pixels as peaks: the pixel must be greater than (or greater than or equal to) its neighbors and/or the gradient must be above a threshold. For saturated pixels, the gradient condition may be skipped since the gradient may not be accurately computed when saturated.
[0087] Significant clues useful for the discrimination of vehicular light sources from other light sources may be gained by monitoring the behavior of light sources over several cycles. In step 503, light sources from prior cycles are compared to light sources from a current cycle to determine the motion of light sources, the change in brightness of light sources, and/or the total number of cycles for which a light source has been detected. While such analysis is possible by storing several images over time and then comparing the light sources within these images, current memory limitations of low-cost processors make it more appealing to create and store light lists. However, the concept of storing the entire image, or portions thereof, is within the scope of the present invention and should be considered an alternate approach. It is more economical to store the lists of light sources found in one or more prior cycles and some, or all, of the properties of the individual light sources. These prior cycle lists may be examined to determine if a light source detected in the current cycle has a “parent” in the prior cycle.
[0088] Prior cycle light source parent identification is performed in accordance with FIG. 11. The process in FIG. 11 occurs for all light sources from the current cycle. Each light from the current cycle is compared to all lights from the prior cycle to find the most likely parent, if any. First, in step 1101, the distance between the light source in the current cycle and the light source from the prior cycle (hereafter called current light and prior light) is computed by subtracting their peak coordinates and then compared to a threshold in step 1102. If the prior light is further away than the threshold, control proceeds to step 1105 and the next prior light is examined. The threshold in step 1102 may be determined in a variety of ways, including being a constant threshold or a speed and/or position dependent threshold, and may take into account vehicle turning information if available. In step 1103, the distance between the prior light and current light is checked to see if it is the minimum distance to all prior lights checked so far. If so, this prior light is the current best candidate for identification as the parent. Another factor in the determination of a parent light source is a comparison of a color ratio characteristic of light sources of two images and/or a comparison to a color ratio threshold. It is also within the scope of the present invention to utilize a brightness value in the determination of a parent light source. As indicated in step 1105, this process continues until all lights from the prior cycle are checked. Once all prior lights are checked, step 1106 determines if a parent light was found from the prior cycle light list. If a parent is identified, various useful parameters may be computed. In step 1107, the motion vector is computed as the X and Y peak coordinate differences between the current light and the parent.
The brightness change in the light source is computed in step 1108 as the difference between the current light and the parent light. The age of the current light, defined to be the number of consecutive cycles for which the light has been present, is set as the age of the parent light plus one. In addition to these parameters, averages of the motion vector and the brightness change may prove more useful than the instantaneous change between two cycles, due to noise and jittering in the image. Averages can be computed by storing information from more than one prior cycle and determining grandparent, great-grandparent, etc. light sources. Alternatively, a running average may be computed, alleviating the need for storage of multiple generations. The running average may, for example, take a fraction (e.g. ⅓) of the current motion vector or brightness change plus another fraction (e.g. ⅔) of the previous average and form a new running average. Finally, light lists containing the position information and possibly other properties, such as the brightness and color of detected light sources, may be stored for multiple cycles. This information may then be used for the classification of the objects from the current cycle in step 504.
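The parent search and the history parameters can be sketched as follows. The dictionary-based light records, the field names, the Euclidean distance measure, and the treatment of a light with no parent (age 1, zero motion) are all assumptions made for illustration; only position is used to select the parent here.

```python
import math

def find_parent(current, prior_lights, max_distance):
    """Return the nearest prior light within max_distance, or None."""
    best, best_dist = None, None
    for prior in prior_lights:
        dist = math.hypot(current["x"] - prior["x"], current["y"] - prior["y"])
        if dist > max_distance:
            continue                      # too far away: examine next prior light
        if best is None or dist < best_dist:
            best, best_dist = prior, dist
    return best

def update_history(current, parent, new_weight=1 / 3):
    """Fill in motion, brightness change, age, and running-average motion."""
    if parent is None:
        current.update(motion=(0.0, 0.0), brightness_change=0.0,
                       age=1, avg_motion=(0.0, 0.0))
        return current
    motion = (current["x"] - parent["x"], current["y"] - parent["y"])
    old = parent["avg_motion"]
    current.update(
        motion=motion,
        brightness_change=current["brightness"] - parent["brightness"],
        age=parent["age"] + 1,
        avg_motion=tuple(new_weight * m + (1 - new_weight) * o
                         for m, o in zip(motion, old)),
    )
    return current
```

Because the running average carries the history forward in a single value per parameter, only the prior cycle light list needs to be retained.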
[0089] More advanced methods of determining light history information will be appreciated by one skilled in the art. For example, determination of the most likely prior light source as the parent may also consider properties such as the brightness difference between the current light source and the prior light source, the prior light source's motion vector, and the color difference between the light sources. Also, two light sources from the current cycle may have the same parent. This is common when a pair of head lamps is originally imaged as one light source but upon coming closer to the controlled vehicle splits into two distinct objects.
[0090] The trend in motion of an object may be used to select which of multiple objects from a prior image is the parent of the current object under consideration. Techniques for tracking the motion of objects are known in the fields of image and video processing and in other fields, such as, for example, the tracking of radar targets. These methods may be employed where appropriate and practical.
Classification step 504 utilizes the properties of light sources extracted in step 502 and the historical behavior of light sources determined in step 503 to distinguish head lamps and tail lamps from other light sources. In summary, the following properties have been identified thus far: peak brightness, total brightness, centroid location, gradient, width, height, and color. The following historical information may also be used: motion vector (x & y), brightness change, motion jitter, age, average motion vector, and average brightness change. Additional properties may be identified that can improve discrimination when utilized with the classification methods presented below. In addition to the parameters extracted from image processing, various vehicle state parameters may be utilized to improve classification. These may include: vehicle speed, light source brightness that corresponds to the controlled vehicle's exterior light brightness (indicative of reflections), ambient light level, vehicle turn rate (from image information, steering wheel angle, compass, wheel speed, GPS, etc.), lane tracking system, vehicle pitch or yaw, and geographic location or road type (from GPS). Although specific uses for individual parameters may be discussed, the present invention should not be construed as limited to these specific implementations. Rather, the goal of the present invention is to provide a generalized method of light source classification that can be applied to any, or all, of the above listed parameters or additional parameters for use in identifying objects in the images.
Finally, the classification of light sources may be supplemented by information from sources other than the image processing system, such as radar detection of objects, for example.
[0091] An example classification scheme proceeds in accordance with FIG. 12. The control sequence of FIG. 12 repeats for each light source identified in the current cycle, as indicated in 1212. In the first step 1201, the brightness of the light source is compared to an immediate dim threshold. If the brightness exceeds this threshold, indicating that a very bright light has been detected, the processing of FIG. 12 concludes and the high beams are reduced in brightness, or the beam pattern is otherwise modified, if not already off. This feature prevents any possible misclassification of very bright light sources and ensures a rapid response to those that are detected.
[0092] Step 1202 provides for the discrimination of street lights by detecting a fast flickering in intensity of the light sources, which is not visible to humans, resulting from their AC power source. Vehicular lights, which are powered from a DC source, do not exhibit this flicker. Flicker may be detected by acquiring several images of the region surrounding the light source at a frame rate that is greater than the flicker rate, preferably at 240 Hz and most preferably at 480 Hz. These frames are then analyzed to detect an AC component, and those lights exhibiting flicker are ignored (step 1203). Additionally, a count, or average density, of streetlights may be derived to determine if the vehicle is likely traveling in a town or otherwise well lit area. In this case, high beam use may be inhibited, or a town lighting mode activated, regardless of the presence of other vehicles. Details of this analysis are provided in previously referenced U.S. patent application Ser. No. 09/800,460, which is incorporated in its entirety herein by reference. An alternative neural network analysis method is discussed in more detail herein.
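One way to test a short sequence of brightness samples for an AC component is to correlate it with a sinusoid at the expected flicker frequency. The sketch below is not the analysis of the referenced application; the function name is hypothetical, and the 120 Hz flicker frequency in the usage note is an assumption corresponding to a lamp on 60 Hz mains.

```python
import math

def flicker_strength(samples, frame_rate_hz, flicker_hz):
    """Amplitude of the component at flicker_hz relative to the mean brightness."""
    n = len(samples)
    mean = sum(samples) / n
    if mean == 0:
        return 0.0
    re = 0.0
    im = 0.0
    for k, s in enumerate(samples):
        phase = 2 * math.pi * flicker_hz * k / frame_rate_hz
        re += (s - mean) * math.cos(phase)
        im += (s - mean) * math.sin(phase)
    return 2 * math.hypot(re, im) / (n * mean)
```

For instance, sixteen frames acquired at 480 Hz span four full cycles of a 120 Hz flicker; a light whose result exceeds an experimentally chosen threshold would be treated as AC powered and ignored.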
[0093] A minimum redness threshold criterion is determined with which the color is compared in step 1204. It is assumed that all tail lamps will have a redness that is at least as high as this threshold. Light sources that exhibit redness greater than this threshold are classified through a tail lamp classification network in step 1205. The classification network may take several forms. Most simply, the classification network may contain a set of rules and thresholds to which the properties of the light source are compared. Thresholds for brightness, color, motion, and other parameters may be experimentally measured for images of known tail lamps to create these rules. These rules may be determined by examination of the probability distribution function of each of the parameters, or combinations of parameters, for each classification type. Frequently, however, the number of variables and the combined effect of multiple variables make generating the appropriate rules complex. For example, the motion vector of a light source may, in itself, not be a useful discriminator of a tail lamp from another light source. A moving vehicle may exhibit the same vertical and horizontal motion as a street sign. However, the motion vector viewed in combination with the position of the light source, the color of the light source, the brightness of the light source, and the speed of the controlled vehicle, for example, may provide an excellent discriminator.
[0094] In at least one embodiment, probability functions are employed to classify the individual light sources. The individual probability functions may be first, second, third, or fourth order equations. Alternatively, the individual probability functions may contain a combination of terms that are derived from first, second, third, or fourth order equations intermixed with one another. In either event, the given probability functions may have unique multiplication weighting factors associated with each term within the given function. The multiplication weighting factors may be statistically derived by analyzing images containing known light sources and/or obtained during known driving conditions. Alternatively, the multiplication weighting factors may be derived experimentally by analyzing various images and/or erroneous classifications from empirical data.
[0095] The output of the classification network may be either a Boolean, true-false, value indicative of a tail lamp or not a tail lamp or may be a substantially continuous function indicative of the probability of the object being a tail lamp. The same is applicable with regard to headlamps. Substantially continuous output functions are advantageous because they give a measure of confidence that the detected object fits the pattern associated with the properties and behavior of a head lamp or tail lamp. This probability, or confidence measure may be used to variably control the rate of change of the controlled vehicle's exterior lights, with a higher confidence causing a more rapid change. With regard to a two state exterior light, a probability, or confidence, measure threshold other than 0% and 100% may be used to initiate automatic control activity.
[0096] In a preferred embodiment, a classification scheme that considers these complex variable relationships is implemented as a neural network. The inputs to this network are many of the previously mentioned variables and may include, for example, the brightness, color, position, motion vector, and age of the light source, along with the vehicle speed and turn rate information if available. More details of the construction of this neural network will be presented herein upon completion of the discussion of the control sequence of FIG. 5. The rules for classification, or the neural network used, may be different if the high beams are off than if they are on. For example, a classification scheme that tends to favor classifying objects as a tail lamp whenever there is doubt may be used if the high beams are off, to prevent the possibility of high beams coming on in the presence of another vehicle. However, when high beams are on, higher certainty may be required to prevent nuisance dimming of the high beams. Since the task of classification is simpler and not as critical when high beams are off, a simpler rule based classifier may be used in the off state and a more complex neural network used in the on state.
[0097] If the object is identified as a tail lamp in step 1206 , the classification process continues for the remaining light sources ( 1212 ) until all light sources are classified ( 1209 ). If the light source is not a tail lamp, it may be further tested to see if it is a head lamp. Similarly, light sources with redness levels below the threshold in step 1204 are tested to see if they are head lamps. First, in step 1207 the brightness of the light source is checked to determine if it is a candidate for a head lamp. The threshold of step 1207 may be a single threshold or, more preferably, is a function of position of the object, the current controlled vehicle's exterior lighting state, and optionally of the controlled vehicle's speed or other parameters. If the light source is brighter than the threshold, it is tested to determine if it is a head lamp. Step 1208 performs similarly to step 1205 , the classification for tail lamps.
[0098] The presence of a head lamp may be determined by a set of rules determined through experimentation or, most preferably, by a neural network. The output of step 1208 may be a true/false indication of the presence of a head lamp of an oncoming vehicle or a measure of the likelihood that the object is a head lamp of an oncoming vehicle. As with step 1205, the classification in step 1208 may be performed substantially differently if the headlamps are on than if they are off. Similarly, the likelihood of an object being a tail lamp of a leading vehicle is determined.
[0099] As previously mentioned with regard to steps 1205 and 1208, the present invention preferably utilizes one or more neural networks to classify detected light sources. Detailed descriptions of neural networks and their implementation for classification problems are provided in the books Neural Networks for Pattern Recognition, by Christopher M. Bishop, published by Oxford University Press (copyright 1995), and Practical Neural Network Recipes in C++, by Timothy Masters, published by Academic Press (copyright 1993). Neural network algorithms may be designed, simulated, and trained using the software NeuroSolutions 4, available from NeuroDimension Inc., located in Gainesville, Fla. The text of each of these references is incorporated in its entirety herein by reference.
[0100] A description of an example neural network for use with the present invention is given with reference to FIG. 13. A neural network 1300 may consist of one or more inputs 1301, input neurons 1302, one or more outputs 1304, hidden layer neurons 1305, and connections 1303; the connections 1303 are also commonly referred to as synapses. For the purposes herein, the input neurons 1302 represent the parameters used for classification of light sources. The synapses between input neurons 1302 and the first hidden layer neurons 1305 represent weights by which these inputs are multiplied. The neurons 1305 sum these weighted values and apply an activation function to the sum. The activation function is almost always a non-linear function, and is preferably sigmoidal, such as a logistic or hyperbolic tangent function. Next, the output of these neurons is connected to the next layer of neurons by synapses that again represent a weight by which this value is multiplied. Finally, an output neuron provides an output value of the network 1304. The network shown in FIG. 13 is a generalized structure. Any number of input neurons may be used and none or any number of intermediate hidden layers may be used, although only one or two hidden layers are typically necessary. The neural network is shown as fully connected, which means that the output of every neuron in one layer is connected by a synapse to every neuron in the next layer. Neural networks may also be partially connected.
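The weighted-sum-and-activation computation of a fully connected network can be sketched in a few lines. This is a generic feed-forward pass, not the trained network of the invention: the function name and weight layout are hypothetical, the hyperbolic tangent is used for every layer, and no bias terms are included because none are described above.

```python
import math

def forward(inputs, layers):
    """Propagate inputs through fully connected layers.

    `layers` is a list of weight matrices; layers[k][j][i] is the synapse
    weight from neuron i of the previous layer to neuron j of layer k.
    """
    activations = list(inputs)
    for weights in layers:
        activations = [
            math.tanh(sum(w * a for w, a in zip(row, activations)))
            for row in weights
        ]
    return activations
```

A network with two inputs, two hidden neurons, and one output would be described by `layers = [[[w11, w12], [w21, w22]], [[v1, v2]]]`, and the single returned value lies between -1 and 1.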
[0101] The weights of the synapses are set to give the neural network its functionality and set its performance at a given pattern recognition or classification task. Weights are set by “training” the neural network. Training is performed by providing the neural network with numerous classified samples of the data to be classified. In the current invention, numerous light sources are captured by the imaging system, stored, and later manually classified by examining the images. Manual classification may occur by noting the actual type of light source when capturing the data or by later examination of the recorded data. To assist in manual classification, additional video may be synchronously captured using a higher resolution or higher sensitivity imaging system. Finally, classification for training may also occur automatically using a more powerful video processing system than that used for production deployment. Such an automatic system may use additional information, such as higher resolution video, to assist in classification of the objects. In either case, the persons or automatic system used to classify the data, which is then used to train a neural network (or used to develop other types of statistical classification algorithms), may be referred to as having “expert knowledge” of the classification problem.
[0102] Synapse weights may be initially set randomly and adjusted until the maximum achievable rate of correct classification of the training samples is achieved. Preferably, additional manually classified samples are used to test the neural network to ensure that it is able to generalize beyond the training data set. The previously mentioned NeuroSolutions program may be used to design the neural network and perform the training. Ideally, the minimum complexity neural network that satisfactorily performs the classification task is used to minimize the computational requirements of the system. Additional neurons, hidd