Title of Invention

'LOOK-UP TABLE AND METHOD FOR REDUCING THE PROGRAMMABLE ARCHITECTURE ELEMENTS IN A LOOK UP TABLE'

Abstract A system for reducing the number of programmable architecture elements in a look-up table required for implementing Boolean functions or operations that are identical or logically equivalent wherein a single set of storage elements connected to the inputs of multiple decoders and said storage elements being concurrently accessed by said decoders to provide simultaneous multiple outputs.
Full Text Field of the Invention
This invention relates to a Look-up table and method for reducing the programmable architecture elements in a Look Up Table for implementing Boolean functions or operations. In other words, sharing configuration data for high logic density on chip.
Background of the invention
The invention relates to programmable logic devices, particularly FPGAs. Lookup tables are highly configurable combinatorial logic devices. Their programming flexibility makes them desirable for use as basic building blocks in Programmable logic Devices(PLDs). Figure 1 illustrates the prior ait wherein the generic implementation of a look-up table (LUT) is shown. Generally a lookup table employs an array of programmable architecture elements, such as SRAM cells[2], to store data bits which are effectively used as output signals, each one of the data bits corresponding to a particular set of input signals[lb]. A particular data bit is coupled to the look-up table output terminable] by means of a decoding multiplexer circuitry[l] that is controlled by the look-up table input signals[lb]. The inputs[lb] & outputs[lc] of the look-up table usually connect to the routing resources of the CPLD/FPGA. In many cases, the output[lc] is also coupled to a sequential element (Flip-flop, latch, etc.). The inputs[lb] & outputs[lc] can of course be connected to anything.
A generic look-up table is used to implement any function that can be bounded within the inputs and outputs of the LUT. Bigger functions are broken down into smaller functions tailored to fit into the LUTs. The device routing resources suitably connect these LUTs. Typically, in a circuit netlist mapped into LUTs, multiple LUTs are used to implement identical or similar functionality. A good example would be an adder implementation in which the number of LUTs programmed to perform addition is directly proportional to the number of bits being added. It follows that a plurality of LUTs would be programmed with the same set of configuration bits. As will be seen, numerous such scenarios exist.
The object and summary of the invention
Thus, the object of this invention is to provide a technique for a look-up table architecture that reduces the number of LUTs to implement identical and/or logically equivalent functions.
To achieve the said objective, this invention provides a look-up table comprising: a plurality of decoders each having a plurality of inputs and an output and each having a plurality of selection lines;
a plurality of common storage elements connected to the inputs of said decoders;
a plurality of buffers respectively connected between said common storage elements and at least one of said decoders for driving the inputs of said at least one decoder; and
a programmable switch network selectively connecting said common storage elements to the inputs of said decoders;
said common storage elements being concurrently accessible by said decoders to simultaneously provide multiple outputs thereto.
The said common storage element comprises static random access memory (SRAM) storage elements.
The above look-up table comprises a controllable inverter connected to the output of at least one of said decoders.
The said controllable inverter has a static control. The said controllable inverter has a dynamic control.
The said look-up table comprises an exclusive OR gate connected to the output of at least one of said decoders.
The present invention further provides a method for reducing the number of programmable architecture elements in a lookup table as claimed in any one of the preceding claims, required for implementing Boolean functions or other operations that are identical or logically equivalent, the method comprising:
connecting a plurality of common storage elements to the inputs of said decoders using a programmable switch network;
connecting a buffer between each of the common storage elements and at least one of the decoders for driving the inputs of the at least one decoder; and
using the decoders to concurrently accesses the common storage elements to simultaneously receive multiple outputs therefrom.
The above method comprises inverting the output of at least one of the decoders.
The inverting comprises performing statically controlled inversion. The inverting comprises performing dynamically controlled inversion.
The instant invention also provides a method for performing logical operations comprising the steps of:
providing a look-up table as claimed in any one of claims 1 to 6 and 8,
selectively connecting the common storage elements to the inputs of the decoders using the programmable switch network, and
using the decoders to concurrently access the common storage elements to simultaneously receive multiple outputs therefrom, the decoders thereby cooperating with the storage elements to perform the logical operations.
Brief Description of the Drawings:
The invention will now be described with reference to the accompanying drawings:
Figure 1 shows the prior art of the look-up table with one decoder
Figure 2 shows the look up table according to the invention having multiple
decoders
Figure 3 shows look up table according to invention having a controllable inverter
or XOR gate at the output of decoders.
Figure 4 shows look up table according to invention using a programmable switch
network
Detailed Description of the Drawings
Programmable look-up tables or LUTs spread across the device are the main logic resources of the CPLD/FPGA. Figure 1 illustrates the prior art wherein the basic LUT structure is coupled to a single decoding multiplexer. The LUT shown in figure 1 is a four input type having sixteen SRAM latches. Any Logic function up to four inputs and one output can be implemented in this LUT. During circuit implementations in the CPLD/FPGA, a major chunk of the combinatorial logic is absorbed by the LUTs. The LUTs are appropriately connected through the device routing resources to match the mapped Boolean network. Referring to figure 2, the SRAM latches [2] are seen connected to three decoding multiplexer[1]. The sixteen inputs [la] of the decoding multiplexers are fed from the SRAM latches [2].The four select
lines[lb] of each of the multiplexers are unique and unrelated. The same uniqueness holds good for multiplexer outputs[lc]. It implies that these three multiplexers can independently function as three four input look-up tables with identical configuration bits that are made available to the multiplexer inputs[la] via a latch data bus[4]. As will be appreciated by those skilled in the art, designs involving repetitive LUT bit patterns can be readily accommodated in the latch-multiplexer arrangement in figure 2. Most common designs with this property are adders, subtracters, multipliers and counters. In a circuit netlist, among the many Boolean functions mapped into LUTs, a similarity in the terms of logical equivalence exists. The following description explains the concept of logical equivalence.
Logical Equivalence:
Two or more Boolean functions are said to be logically equivalent if they can be implemented in look-up tables with identical configuration bits. Considering a four input, single output look-up table of the preferred embodiment, examples of logical equivalent functions are :
1. ABC+D, AB+C, ABC, A+B, AB
2. AB-+CD, AB-+C, A+BC, A-+BC, AB~, AB, A+B, A+B~
3. AB(C+D~), A(B+C~), ABC, ABC~, AB+C-, AB-+C, A+B, A+B~, AB,
AB~
It is apparent that the functions will have the same set of configuration bits. The four variable function is the parent function and the rest of them are its derivatives or children. The children have lesser number of variables. Thus, some of the LUT pins will remain unused or will have to be pulled up/down while implementing the children functions. It might also be possible that the children functions have the same number of variables as their parent functions. Such children & many more functions can be derived from the parent function if one or more inputs to the multiplexer select lines are connected through controlled inverters. Still more are possible, if the multiplexer's output is also invertible.
Figure 3 illustrates the insertion of a controlled inverter/XOR gate[3] before the output of the decoding multiplexer in order to increase the number of derivative functions. The inverter contro.l[3a] may be static or dynamic.
If the sixteen latches of the preferred embodiment source several decoding multiplexers, buffers[4a] might have to be inserted at the latch outputs or in between the latch data bus[4] in order to communicate the latch data to distant multiplexers. As will be appreciated by those skilled in the art, the lines conveying latch signals to the decoding multiplexers need not be optimized for delays if the CPLD/FPGA is not intended to be used as an extremely rapid
reconfigurable device. Thus, in nearly all cases the latch data bus, which is sixteen bit wide here, may be minimum-width, with minimum spacing, and can follow any circuitous or meandering path to the decoding multiplexers. Therefore routing the latches to the decoding multiplexers is not a critical design issue and may be given last priority while designing the device.
Figure 4 is an interesting way of implementing the invention. [4] is the latch data bus that extends across the device. Programmable switches[4b] are inserted to distribute and route the latch data buses[4] originating from various LUT SRAM latches[2]. These latch data buses may be routed to nearby or far flung regions of the device using a programmable bus routing network[4]. The routing network[4] in no way introduces any extra delays in the final circuit implementation as static logic data resides on these latch data lines. So these latch data buses[4] can traverse long distances and use any number of programmable switches[4b & 4c] without affecting circuit performance. As all the nets in a bus are being routed at a time, configuration latch count is also low. It is of course possible to alter the multiplexer connectivity to the latch data bus and even include flexibility at the decoding multiplexer inputs[la] (sixteen here). Inputs to multiplexer [Ix] delineate such possibilities. In order to include more flexibility, programmable switches [4c] are inserted to couple latch data bus[4] to the multiplexer inputs[la] in the desired fashion.
The preferred embodiment of the present invention has thus provided a look-up table with a plurality of read poits capable of performing functions that are identical or logically equivalent with other variants possible. The area saved by employing such a scheme is appreciable without any delay overhead. In conclusion, the concept of sharing SRAM latches of LUTs reduces the number of SRAM latches and improves chip density. While, the above is a description of the preferred embodiment of the present invention, it is possible to use various alternatives, modifications and equivalents. For example, a memory architecture other than SRAM may be employed as the configuration host. These embodiments and others that become obvious in the light of the above disclosure are intended to fall within the scope of the present invention.






We claim:
1. A look-up table comprising:
a plurality of decoders (1) each having a plurality of inputs (1a) and an output and each having a plurality of selection lines;
a plurality of common storage elements (4) connected to the inputs of said decoders;
a plurality of buffers (4a) respectively connected between said common storage elements (4) and at least one of said decoders for driving the inputs of said at least one decoder; and
a programmable switch network (4b, 4c) selectively connecting said common storage elements to the inputs of said decoders;
said common storage elements (4) being concurrently accessible by said decoders to simultaneously provide multiple outputs thereto.
2. A look-up table as claimed in claim 1 wherein said common storage
elements comprises static random access memory (SRAM) storage
elements.
3. The look-up table as claimed in claim 1 or 2 comprising a controllable
inverter connected to the output of at least one of said decoders.
4. The look-up table as claimed in claim 3 wherein said controllable
inverter has a static control.
5. The look-up table as claimed in claim 3 wherein said controllable
inverter has a dynamic control.
6. The look-up table as claimed in claim 1 or 2 comprising an exclusive
OR gate connected to the output of at least one of said decoders.
7. A method for reducing the number of programmable architecture
elements in a lookup table as claimed in any one of the preceding
claims, required for implementing Boolean functions or other operations
that are identical or logically equivalent, the method comprising:
connecting a plurality of common storage elements to the inputs of said decoders using a programmable switch network;
connecting a buffer between each of the common storage elements and at least one of the decoders for driving the inputs of the at least one decoder; and
using the decoders to concurrently accesses the common storage elements to simultaneously receive multiple outputs therefrom.
8. The method as claimed in claim 7 comprising inverting the output of at
least one of the decoders.
9. The method as claimed in claim 8 wherein inverting comprises
performing statically controlled inversion.
10. The method as claimed in claim 8 wherein inverting comprises
performing dynamically controlled inversion.
11. A method for performing logical operations comprising the steps of:
providing a look-up table as claimed in any one of claims 1 to 6 and 8,
selectively connecting the common storage elements to the inputs of the decoders using the programmable switch network, and
using the decoders to concurrently access the common storage elements to simultaneously receive multiple outputs therefrom, the decoders thereby cooperating with the storage elements to perform the logical operations.
12. A look-up table substantially as herein described with reference to and
as illustrated in the accompanying drawings.
13. A method for reducing the number of programmable architecture
elements in a lookup table substantially as herein described with
reference to and as illustrated in the accompanying drawings.

Documents:

668-del-2001-abstract.pdf

668-del-2001-claims.pdf

668-del-2001-correspondence-others.pdf

668-del-2001-correspondence-po.pdf

668-del-2001-description (complete.pdf

668-del-2001-drawings.pdf

668-del-2001-form-1.pdf

668-del-2001-form-13.pdf

668-del-2001-form-18.pdf

668-del-2001-form-2.pdf

668-del-2001-form-3.pdf

668-del-2001-pa.pdf

668-del-2001-petition-others.pdf


Patent Number 230979
Indian Patent Application Number 668/DEL/2001
PG Journal Number 13/2009
Publication Date 27-Mar-2009
Grant Date 28-Feb-2009
Date of Filing 15-Jun-2001
Name of Patentee STMICROELECTRONICS PVT. LTD.
Applicant Address PLOT NO. 1, KNOWLEDGE PARK III, GREATER NOIDA - 201308, UP, INDIA.
Inventors:
# Inventor's Name Inventor's Address
1 ANKUR BAL KF-56 KAVI NAGAR, GHAZABAD, 201002, IINIA.
PCT International Classification Number G06R 9/34
PCT International Application Number N/A
PCT International Filing date
PCT Conventions:
# PCT Application Number Date of Convention Priority Country
1 NA