Norwegian University of Science and Technology

Research

CAS Research - VLSI for Digital Signal Processing

Below you will find abstracts of a number of doctoral theses defended in this research field at the CAS group. They are followed by other useful links related to our research.


On Power Consumption Issues in FIR Filters with Application to Communication Receivers: Complexity, Word length, and Switching Activity

Power consumption in CMOS VLSI circuits has in recent years become a major design constraint. This is particularly important for wireless networks, due to the limited lifetime of the batteries on which wireless nodes operate.

Orthogonal Frequency Division Multiplexing (OFDM) is one example of a technique which in recent years has become widely applied in wireless communication systems. However, the performance of OFDM and other spectrally efficient schemes depends, to a large extent, on advanced digital signal processing (DSP) and on the use of efficient and possibly adaptive resource allocation and transmission techniques. These in turn require that accurate estimates of the channel are available in the receiver and transmitter.

However, accurate channel estimation of a time and frequency dispersive wireless fading channel calls for complex estimators, which might lead to significant power dissipation in such devices. Therefore, characterizing and analyzing the power consumed by such devices under different channel conditions, and optimizing for power, is important to reduce the overall power consumption of the system. This thesis considers one class of estimators, namely linear estimators based on finite impulse response (FIR) filters, and the power related challenges in such estimators.

The power consumed by such estimators depends, in part, on the complexity of the estimator, i.e., the length of the FIR filter. The filter length is one of the factors affecting the estimation accuracy. An analysis of the relation between the performance of such estimators and the required complexity for these devices under different channel conditions, i.e., in the presence of noise, is performed in this thesis. In this study we show that a small increase in this noise can lead to a considerable increase in the required estimator complexity if a given Normalized Mean Square Error (NMSE) performance for the channel estimation must be upheld, in particular at medium-to-high Channel Signal to Noise Ratios (CSNR).

Furthermore, reducing the power consumption through word-length optimization, when realizing such estimators, is an attractive approach. Due to the characteristics of the input signal to such estimators, a special treatment of channel estimation error due to quantization of estimator filter coefficients is needed. In this thesis we investigate the impact of finite coefficient word length on channel estimator performance. A theoretical analysis of the increase in channel estimation error due to quantization of estimator coefficients is performed, and the behavior of this error in different fading environments and for different filter orders is studied.

The power consumed in a channel estimator is also influenced by the switching activity in the input signal of the estimator. Characterizing the switching activity in the input signal, including how this activity changes in different environments, e.g., in the presence of noise, is a subject of the work performed in this thesis. In this study we give an expression for direct calculation of the correlation coefficient for the most significant bit in a signal, using the word-level correlation coefficient. We also derive expressions for accurately calculating the variance (σ²) and word-level correlation coefficient (ρ) for a correlated signal, when an additional noise of a given variance is added to the signal. This can be used to estimate the bit-level switching activity in a signal in the presence of noise, based on the Dual Bit Type (DBT) method. The impact the additional noise has on the switching activity of a correlated signal has also been studied. These results make it possible for a designer to model the actual input switching activity in different real life noisy environments, enabling realistic power consumption estimation.
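The relations described above can be illustrated with a minimal sketch. It assumes a zero-mean, stationary Gaussian signal with lag-one correlation ρ and additive white noise that is independent of the signal. These assumptions, the function names, and the use of the classical arcsine law for the sign bit are ours; the expressions derived in the thesis may differ.

```python
import math

def add_white_noise(sigma2, rho, noise_var):
    """Variance and lag-one correlation after adding independent white noise."""
    total_var = sigma2 + noise_var
    # White noise is uncorrelated between samples, so the lag-one
    # covariance rho * sigma2 is unchanged while the variance grows.
    return total_var, rho * sigma2 / total_var

def msb_correlation(rho):
    """Sign-bit (MSB) correlation of a zero-mean Gaussian signal (arcsine law)."""
    return 2.0 / math.pi * math.asin(rho)

def msb_switching_activity(rho):
    """Probability that the sign bit toggles between consecutive samples."""
    return (1.0 - msb_correlation(rho)) / 2.0

# Illustrative values only: unit-variance signal, rho = 0.9, noise variance 0.5.
noisy_var, noisy_rho = add_white_noise(1.0, 0.9, 0.5)
print(noisy_var, noisy_rho)
print(msb_switching_activity(0.9), msb_switching_activity(noisy_rho))
```

Under these assumptions the added noise lowers the word-level correlation, which in turn raises the toggle probability of the most significant bit.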

A study on switching activity reduction in estimator filters using a coefficient reordering method is another part of this thesis. Closed-form analytical models for the coefficient and input data switching activity before and after reordering in an estimator filter are developed, and the impact that coefficient reordering has on the input data, and consequently on the total switching activity, is studied. Using our derived models we show that the impact of coefficient reordering on data input increases first as the input signal correlation, ρz, increases, but this impact decreases again when ρz → 1. This impact is 0 for ρz ≈ 0 and ρz ≈ 1. Our results show that this impact is highest for ρz = 0.7 to ρz = 0.999, and becomes larger for large values of the estimator order N.
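To make the idea of coefficient reordering concrete, here is a minimal sketch. In a MAC-based FIR filter the products can be accumulated in any order, so the coefficients may be fetched in an order that reduces bit toggles on the coefficient bus. The greedy nearest-neighbour heuristic, the function names, and the sample coefficients are illustrative assumptions, not the method or the models of the thesis.

```python
def hamming(a, b, width):
    """Number of differing bits between two width-bit two's-complement words."""
    mask = (1 << width) - 1
    return bin((a ^ b) & mask).count("1")

def total_toggles(words, width):
    """Bit toggles on a bus that carries the words in the given order."""
    return sum(hamming(x, y, width) for x, y in zip(words, words[1:]))

def greedy_reorder(coeffs, width):
    """Return tap indices so that each next coefficient is the closest
    (in Hamming distance) to the previous one. Greedy, not optimal."""
    remaining = list(range(len(coeffs)))
    order = [remaining.pop(0)]              # start from tap 0
    while remaining:
        last = coeffs[order[-1]]
        nxt = min(remaining, key=lambda i: hamming(last, coeffs[i], width))
        remaining.remove(nxt)
        order.append(nxt)
    return order

coeffs = [3, -4, 5, -3, 4, -5, 2, -2]       # illustrative 8-bit coefficients
order = greedy_reorder(coeffs, width=8)
before = total_toggles(coeffs, 8)
after = total_toggles([coeffs[i] for i in order], 8)
print(order, before, after)
```

Note that the new tap order also changes the order in which the input samples are fetched, which is the effect on input data switching activity that the models above describe.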

Considering a realistic case, we further study the possibility of reducing the switching activity in a MAC-based channel estimator when realized with different orders and word lengths, and operating in different environments. This study shows that making the right choices when reordering gives a larger reduction in switching activity. The decision will also depend on the channel condition in which the system is operating most of the time. The results of this study show that when the word length is reduced, the use of reordering can in some cases, e.g., when the estimator order is increased to N = 50 and beyond, actually lead to an increase in total switching activity if extra care is not taken. It also shows that for large N and input data with medium to high correlation, it is not possible to reduce the switching activity using reordering if the word length is reduced to W = 8 or lower. When the word length is reduced, the optimization in general becomes even more sensitive to the characteristics of the input data. The designer consequently needs to have this information available to achieve a reduction, or even just to avoid an increase, in switching activity for small values of W.

It should be mentioned that although we look at these power related challenges in the context of estimators, the results for several parts of this work are not limited to channel estimators. The results concerning the switching activity reduction in MAC-based channel estimators can be generally applied to FIR filters, and the study on the input signal switching activity is valid for signals input to any digital signal processing (DSP) module.

On Power Consumption Issues in FIR Filters with Application to Communication Receivers: Complexity, Word length, and Switching Activity.
Doctoral thesis at NTNU, 2009:201, Asghar Havashki


Design of Low-Power Reduction-Trees in Parallel Multipliers

Multiplications occur frequently in digital signal processing systems, communication systems, and other application specific integrated circuits. Multipliers, being relatively complex units, are deciding factors for the overall speed, area, and power consumption of digital computers. The diversity of application areas for multipliers and the ubiquity of multiplication in digital systems give rise to a variety of requirements for speed, area, power consumption, and other specifications. Traditionally, speed, area, and hardware resources have been the major design factors and concerns in digital design. However, the design paradigm shift over the past decade has brought dynamic power and static power into play as well.

In many situations, the overall performance of a system is decided by the speed of its multiplier. In this thesis, parallel multipliers are addressed because of their speed superiority. Parallel multipliers are combinational circuits and can be subject to any standard combinational logic optimization. However, the complex structure of the multipliers imposes a number of difficulties for the electronic design automation (EDA) tools, as they simply cannot consider the multipliers as a whole; i.e., EDA tools have to limit the optimizations to a small portion of the circuit and perform logic optimizations. On the other hand, multipliers are arithmetic circuits, and considering arithmetic relations in the structure of multipliers can be extremely useful and can result in better optimization results. The different structures obtained using the different arithmetically equivalent solutions have the same functionality but exhibit different temporal and physical behavior. These arithmetic equivalencies have previously been used mainly to optimize for area, speed and hardware resources.

In this thesis a design methodology is proposed for reducing dynamic and static power dissipation in the partial product reduction tree of parallel multipliers. Basically, the reduction tree is optimized using information about the input pattern that is going to be applied to the multiplier (such as static probabilities and spatiotemporal correlations). The optimization is obtained by selecting the power efficient configurations by searching among the permutations of partial products for each reduction stage. Probabilistic power estimation methods are introduced for leakage and dynamic power estimations. These estimations are used to lead the optimizers to minimum power consumption. Optimization methods, utilizing the arithmetic equivalencies in the partial product reduction trees, are proposed in order to reduce the dynamic power, static power, or total power, which is a combination of dynamic and static power. The energy saving is achieved without any noticeable area or speed overhead compared to random reduction trees. The optimization algorithms are extended to include spatiotemporal correlations between primary inputs. As another extension to the optimization algorithms, the cost function is considered as a weighted sum of dynamic power and static power. This can be extended further to contain speed merits and interconnection power. Through a number of experiments the effectiveness of the optimization methods is shown. The average number of transitions obtained from simulation is reduced significantly (up to 35% in some cases) using the proposed optimizations.
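The search among permutations can be sketched for a single reduction stage. The example below takes one column of six partial-product bits with known static probabilities and exhaustively assigns them to two full adders, choosing the grouping with the lowest estimated output switching. It assumes a zero-delay model (no glitches) and spatially and temporally independent inputs, and it ignores static power; all names and numbers are hypothetical and the thesis uses more detailed estimators.

```python
from itertools import combinations

def full_adder_probs(p):
    """Signal probabilities of sum and carry for independent inputs p = (a, b, c)."""
    a, b, c = p
    p_sum = (1 - (1 - 2 * a) * (1 - 2 * b) * (1 - 2 * c)) / 2   # odd parity
    p_carry = a * b + a * c + b * c - 2 * a * b * c             # majority
    return p_sum, p_carry

def activity(p):
    """Toggle probability of a node under temporal independence."""
    return 2 * p * (1 - p)

def stage_cost(groups):
    """Estimated switching at all full-adder outputs of one stage."""
    return sum(activity(q) for g in groups for q in full_adder_probs(g))

def best_grouping(probs):
    """Exhaustively split six bits into two full-adder input triples."""
    idx = range(len(probs))
    best = None
    for first in combinations(idx, 3):
        if 0 not in first:
            continue                      # skip mirrored partitions
        second = tuple(i for i in idx if i not in first)
        cost = stage_cost([[probs[i] for i in first],
                           [probs[i] for i in second]])
        if best is None or cost < best[0]:
            best = (cost, first, second)
    return best

probs = [0.5, 0.5, 0.5, 0.1, 0.1, 0.1]    # illustrative static probabilities
print(best_grouping(probs))
```

Every grouping computes the same arithmetic result, so the choice only affects the internal switching activity, which is what makes this kind of search possible.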

The proposed methods are in general applicable to arbitrary multi-operand adder trees. As an example, the optimization is applied to the summation tree of a class of elementary function generators which is implemented using summation of weighted bit-products. Accurate transistor-level power estimations show up to 25% reduction in dynamic power compared to the original designs.

Power estimation is an important step of the optimization algorithm. A probabilistic gate-level power estimator is developed which uses a novel set of simple waveforms as its kernel. The transition density of each circuit node is estimated. This power estimator makes it possible to use a global glitch filtering technique that can model the removal of glitches in more detail. It produces error-free estimates for tree-structured circuits. For circuits with reconvergent fanout, experimental results using the ISCAS’85 benchmarks show that this method generally provides significantly better estimates of the transition density compared to previous techniques.

Design of Low-Power Reduction-Trees in Parallel Multipliers.
Doctoral thesis at NTNU, 2008:61, Saeeid Tahmasbi Oskuii


Implementation of synthesis filter bank for subband coding of images

A uniform filter bank structure is developed which retains the high coding gain of subband coders while having a complexity close to that of the discrete cosine transform (DCT). Reduced complexity is obtained by replacing the six upper channels by an 8-point DCT. By using longer filters in the two lower channels, blocking effects, which are disturbing artifacts in transform coders, are eliminated.

The filter bank is required to handle HDTV sample rates at a minimum area cost and acceptable power consumption. Different algorithms and architectures for the filter bank structure are evaluated in order to satisfy these requirements. Through simulations and analysis the optimal word lengths of coefficients and internal signals are found. The memory requirements between vertical and horizontal filtering are minimized so that a two-dimensional filter bank can be implemented on one chip. The DCT part has been processed in 0.8 µm CMOS technology and functional chips received. The one-dimensional filter bank will be processed shortly.

Analysis and VLSI design of synthesis filter bank for image subband coding. Ph.D. Thesis, Fys.El.-rapport 1997:33, Ingil Sundsbø


VLSI solutions for speech recognition

This work concentrates on high speed digital signal processing in CMOS, both in general and in applications to continuous speaker-independent large vocabulary speech recognition.

Specifically, design automation with the TSPC and CDPD circuit techniques has been studied and methods for this developed. A general standard cell library in 0.8 µm CMOS suitable for synthesis of DSP algorithms has been developed and tested on fabricated test designs with satisfactory results. The library contains TSPC flip-flops capable of a maximum clock frequency of 700 MHz, as well as a 1 ns matching full adder primitive, and is fully compatible with commercial logic synthesis tools.

Two design examples demonstrate the performance improvements possible with the proposed library and design approach. The first is a third order wave digital filter implementation with a typical sample rate of 300 MHz, which is an improvement by a factor of more than two compared to previous work on the same filter and in the same process.

The second is a pdf (probability density function) co-processor for speech recognition capable of performing 160 million subtract-square-multiply-accumulate operations per second, which is comparable to the performance of a Cray supercomputer on the same problem.
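The subtract-square-multiply-accumulate operation can be illustrated with a small software sketch. It assumes that the pdf is a diagonal-covariance Gaussian, as is common in recognizers based on hidden Markov models; the abstract does not state the exact form, so this is an assumption, and the function names are hypothetical.

```python
import math

def weighted_sq_distance(x, mean, inv_var):
    """Sum of w_i * (x_i - mu_i)^2: one subtract, square, multiply and
    accumulate per feature dimension."""
    acc = 0.0
    for xi, mi, wi in zip(x, mean, inv_var):
        d = xi - mi            # subtract
        acc += wi * d * d      # square, multiply, accumulate
    return acc

def log_gaussian(x, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    inv_var = [1.0 / v for v in var]
    log_norm = sum(math.log(2.0 * math.pi * v) for v in var)
    return -0.5 * (log_norm + weighted_sq_distance(x, mean, inv_var))

print(weighted_sq_distance([1.0, 2.0], [0.0, 0.0], [1.0, 0.5]))
```

The normalization term depends only on the model, so it can be precomputed, leaving the inner loop as the part that dominates the operation count.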

Design Automation of High Speed Digital Signal Processing in VLSI with Applications in Speech Recognition Systems Based on Hidden Markov Models. Ph.D. Thesis, Fys.El.-rapport 1996:36, Johnny Pihl


On VLSI Realization of a Low-Power Audio Coder with Low System Delay

This thesis is a contribution to low-power, low-voltage realization of digital signal processing algorithms in VLSI. The discussions are on algorithms, architectures and circuit level designs of a proposed audio encoder in a wireless digital microphone.

At the algorithmic level, a cosine-modulated filter bank has been compared to a parallel FIR filter bank, showing a complexity of only 1/5 in terms of arithmetic operations. By signal flow graph transformations, we have identified a suitable processing element, the X-PE. This X-PE has been shown to be efficiently realized by distributed arithmetic.

At the circuit level, 5 different full adder candidates have been compared, by full custom cell design, fabrication and measurement of 5 separate test chips. The best full adder, the SRPL2, is 50 % faster, yet needs only 45 % of the power, compared to a standard cell design at 2.4 V supply voltage. A double-edge triggered D-type flip-flop, the SRPL-DETFF, has been proposed, based on the SRPL technique. Simulation and test chip measurements have shown that the SRPL-DETFF is twice as fast, while the energy per operation is between 47 % and 80 %, as compared to a standard CMOS flip-flop below 2 V. A bit-serial adder has been proposed based on these findings, requiring only 25-50 % of the energy per bit-operation, while exhibiting higher performance than a standard cell solution.

The simple delay model proposed by Hu has been exploited, exhibiting very close correspondence to detailed simulations and test chip measurements for supply voltages from 5 V to 1.1 V. The delay variations due to temperature and process variations have been successfully included. Sensitivity analysis and simulations suggest very conservative design margins with respect to timing at low supply voltages.

At the architectural level, we have proposed a bit-serial system architecture based on three distributed arithmetic X-PEs and two bit-serial processing elements. We have made evident that the proposed solution will safely operate from a single 1.2 V rechargeable battery cell, yielding a power consumption scaling of 1/17. Compared to a 5 V realization of the FIR contestant, a power reduction by 5 × 17 = 85 can be obtained. The estimate of the power consumption for the filter bank is only 0.46 mW, clearly demonstrating that the proposed audio encoder may be realized, with no significant power disadvantage.
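The two factors in the power reduction can be reproduced with a short calculation. It assumes that dynamic CMOS power scales with the square of the supply voltage at unchanged capacitance and clock frequency, which is consistent with the factor 1/17 quoted above but is our reading rather than a statement from the thesis. The factor 5 is the reduction in arithmetic operations of the cosine-modulated filter bank relative to the parallel FIR filter bank.

```python
def dynamic_power_ratio(v_old, v_new):
    """P_old / P_new for dynamic CMOS power P ~ C * V^2 * f at fixed C and f."""
    return (v_old / v_new) ** 2

voltage_gain = dynamic_power_ratio(5.0, 1.2)    # (5 / 1.2)^2, roughly 17.4
algorithm_gain = 5                               # 1/5 of the arithmetic operations
total_gain = algorithm_gain * round(voltage_gain)
print(voltage_gain, total_gain)
```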

On VLSI Realization of a Low-Power Audio Coder with Low System Delay. Ph.D. Thesis, Fys.El.-rapport 1996:08, Tormod Njølstad


High Speed Cell Library in CMOS for Bit-Serial Implementation of DSP Algorithms

We have been working with a versatile new cell library developed for a maximum clock frequency of 640 MHz at 5 V in a 0.8 µm process. The cell library is targeted towards a hardwired bit-serial design style featuring an automatic standard cell based layout approach for implementation of DSP functional modules and circuits.

The high speed is obtained by employing an enhanced version of the True Single Phase Clock circuit technique. The selection of bit-serial operators is motivated by an optional link to existing commercial synthesis tools for mapping a behavioural algorithmic description into a bit-serial architecture at the register transfer level.

Four dedicated circuits have been fabricated in a 0.8 µm CMOS process to demonstrate functionality, performance and applicability. Test results confirm correct operation well above the target frequency of 640 MHz at 5 V.

A High Speed Cell Library in CMOS for Bit-Serial Implementation of DSP Algorithms. Ph.D. Thesis, Fys.El.-rapport 1996:05, Jan Egil Øye


Useful links

Arithmetic Module Generator

Research efforts in structures for high performance multipliers and adders with short design time have resulted in an Arithmetic Module Generator. The Module Generator was originally coded by a Master's student, Espen Sand, for whom Johnny Pihl (see thesis abstract above) was the supervisor. The Generator is capable of generating structural VHDL and Verilog code for fast multipliers, adders and subtractors of arbitrary word length. It also features a large number of structural options, especially for multipliers.

Best Paper Award at the 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)

The paper "Robust spherical microphone array beamforming with multi-beam-multi-null steering, and sidelobe control" by Haohai Sun (IET), Prof. Shefeng Yan (formerly at IET, now at the Institute of Acoustics, Chinese Academy of Sciences, Beijing, China), and Prof. U. Peter Svensson (IET) was selected as the best paper at the biennial IEEE WASPAA conference at Mohonk Mountain House, New Paltz, NY, USA, 18-21 October 2009 (http://www.waspaa2009.com/). There were 181 paper submissions, 89 of which were selected for the conference after a peer-review process.

Circuits and Systems Group - Staff


Fjeldly, Tor A. - Professor
Phone: +47 64844747 Room: UniK
Hergum, Ragnar - Associate Professor
Phone: +47 735 92023 Room: C336b
Kjeldsberg, Per Gunnar - Professor
Phone: +47 735 94405 Room: B309
Larsen, Bjørn B. - Associate Professor
Phone: +47 735 94493 Room: B317
Svarstad, Kjetil - Professor
Phone: +47 735 92715 Room: B319
Sæther, Trond - Professor
Phone: +47 735 94316 Room: B321
Ytterdal, Trond - Professor
Phone: +47 735 94411 Room: B313
Aas, Einar J. - Professor
Phone: +47 735 94317 Room: B311
Cenkeramaddi, Linga Reddy - Research Fellow
Phone: +47 735 92793 Room: A362
Goyal, Nitin - Research Fellow
Room: UniK
Halvorsrød, Thomas - Research Fellow
Phone: +47 735 94400
Hammari, Elena - Research Fellow
Phone: +47 735 90704 Room: B412
Hansen, Hans Herman - Research Fellow
Phone: +47 735 91436 Room: B312
Hellandsvik, Are - Research Fellow
Phone: +47 735 91436 Room: B312
Kaald, Rune - Research Fellow
Phone: +47 735 91445 Room: B312
Monga, Udit - Research Fellow
Nygård, Knut H. - Research Fellow
Room: UniK
Palanichamy, Manikandan - Research Fellow
Phone: +47 735 91436 Room: B312
Uchevler, Bahram Najafi - Research Fellow
Phone: +47 735 91436 Room: B312
Vinje, Anders - Research Fellow
Phone: +47 735 91436 Room: B312
Ye, Xu - Research Fellow
Phone: +47 735 91445 Room: B312

CAS Research - Hardware/Software Codesign and System Level Design

Modern electronic systems grow in complexity, and are combinations of different types of components: microprocessors, signal processors, RAM/ROM, digital logic and analogue circuits. At the same time, designers meet demands for shorter development time and lower development cost. This incites the use of higher abstraction levels, so-called system level design. System level design requires system level tools that simultaneously handle both hardware and software, for modeling, partitioning, verification and synthesis of the complete system. Hardware/Software Codesign covers all of these topics, and as such draws on the experience from many different technological communities.


Data Transfer and Storage Exploration

Many integrated circuit systems, particularly in the multi-media and telecom domains, are inherently data dominant. For this class of applications, data transfer and storage largely determine cost and performance parameters. This is the case for chip size, since large memories are usually needed, performance, since accessing the memories may very well be the main bottleneck, and power consumption, since the memories and buses consume large quantities of energy. Even for systems with caches, the overall storage requirement has vital impact on the performance and power consumption, since it greatly influences the number of slow and power expensive cache misses. For the system development process, the designer must hence concentrate first on exploring the data transfer and storage to achieve a cost optimized end product.

Multi-processor system-on-chip

The next generation of embedded systems will be dominated by industrial and consumer devices, which are able to deliver communications and rich, scalable multimedia content anytime, anywhere. These wireless communication and multimedia applications (e.g. 3D games, advanced medical observation and supervision systems) will lead to the creation of extremely complex and dynamic code with huge resource requirements. Such systems can typically not be handled by a single micro-processor. Instead, a System on Chip with multiple processing elements, an MPSoC, is required. The processing elements will be heterogeneous in nature, and can be anything from a standard processor to more specialized or even field programmable hardware units. The key characteristics of the applications will be the intensive computation, the large data transfer and storage requirement, and the need for efficient resource management. In such systems it is essential to optimize the utilization of computational resources both statically at design time, and dynamically at run-time.

We cooperate closely with IMEC in Leuven, Belgium, on research in this domain. More information about the topic and the results of our work can be found at the following places:

Below you will find abstracts of a number of doctoral theses defended in this research field at the CAS group.

Hierarchical Memory Size Estimation for Loop Transformation and Data Memory Platform Exploration

In today's embedded systems, the memory hierarchy is rapidly becoming a major bottleneck in terms of power, performance and area, due to the very large amount of (memory related) data that needs to be transferred and stored (temporarily). This is especially the case for portable multi-media application systems. These applications are characterized by deep loop nests and multi-dimensional arrays at the high level. Due to the dramatically increasing size and complexity of system-on-a-chip (SoC) designs and stringent time-to-market requirements, the methodology and tools for chip design must be raised to the system level. Early analysis tools are particularly critical in enabling SoC designers to take full advantage of the many architectural options available.

For memory optimization, the early high level techniques aim either to design an optimal memory platform for a given application or to optimize the application code in order to take advantage of the memory platform features, or even both. Loop transformation is one such important high level optimization technique. It modifies the execution order of loops and statements without changing the application functionality. Existing loop transformation algorithms all steer the selection of loop transformations based either on reduction of data access lifetime or on improvement in data locality and regularity. These are, however, very abstract cost functions which do not represent the exact memory size requirement of the arrays and how the data will be mapped onto the memory platform later on. Existing algorithms also all result in one final loop transformation solution. As different loop transformations may result in optimal utilization for different memory platform instances, ad-hoc decisions at this stage without estimating their impact on the actual hierarchy utilization can lead to a sub-optimal final solution. An evaluation of the effect of later design stages is hence required. On the other hand, there usually exists a huge number of loop transformation possibilities, so the estimation must be performed repeatedly, and the computation time of the estimation technique becomes critical for it to be useful during the loop transformation search space exploration.

This dissertation proposes a memory footprint estimation methodology. An intra-array memory footprint estimation is performed first, followed by an inter-array estimation. In order to obtain estimates fast enough to be used repeatedly during the early high level search space exploration, several techniques have been introduced. A fast intra-array memory footprint estimation is performed in the iteration domain based on the maximal lifetime of data accesses, which is defined by the maximal dependency vector. Two approaches, an ILP formulation and a vertexes approach, have been introduced for achieving a fast maximal dependency vector calculation. The fast inter-array estimation has been achieved based on several Hanoi tower based approaches.

A hierarchical memory size estimation methodology has also been proposed in this dissertation. It estimates the influence of any given sequence of loop transformation instances on the mapping of application data onto a hierarchical memory platform. As the exact memory platform instantiation is often not yet defined at this high level design stage, a platform independent estimation is introduced with a Pareto curve output for each loop transformation instance. It can steer the designer or an automatic steering tool to select all the interesting loop transformation instances that might later lead to low power data mapping for any of the many possible memory hierarchy instances. This is useful when the memory platform is not defined yet, or for a given memory hierarchy instance. It also makes it possible to find the most appropriate low power memory hierarchy instance by performing an early power estimation of different memory hierarchy instances. Initially the source code is used as input for estimation, resulting in an initial approach. However, performing the estimation repeatedly from the source code is too slow for the large loop transformation search space exploration. An incremental approach, based on local updating of the previous result, is thus introduced to handle sequences of different loop transformations. Several advanced techniques have also been used in these two approaches in order to perform a fast estimation, such as bounding box geometrical model based data reuse analysis, platform independent memory hierarchy layer assignment estimation, and fast intra- and inter-array memory footprint estimation.
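As a minimal sketch of what a Pareto curve output means here, the function below keeps only the non-dominated trade-off points among candidate mappings. The choice of (memory size, energy) as the two axes and all numbers are illustrative assumptions; the abstract does not specify the axes used in the thesis.

```python
def pareto_front(points):
    """Return the (size, energy) points that no other point dominates;
    smaller is better on both axes."""
    front = []
    best_energy = float("inf")
    # Sorting by size means a point is non-dominated exactly when its
    # energy is lower than that of every smaller (or equal-sized) point.
    for size, energy in sorted(points):
        if energy < best_energy:
            front.append((size, energy))
            best_energy = energy
    return front

# Hypothetical candidates for one loop transformation instance.
candidates = [(64, 9.0), (128, 5.0), (128, 7.0), (256, 5.5), (512, 3.0)]
print(pareto_front(candidates))
```

Keeping the whole curve instead of a single point is what lets the selection be postponed until the memory platform is known.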

The feasibility and usefulness of the methodologies are substantiated using several representative real-life application demonstrators. They show, for instance, that the fast memory footprint estimation can be two orders of magnitude faster than the compared techniques while still achieving fairly accurate estimation results. For the hierarchical memory size estimation methodology, the initial approach is two orders of magnitude faster than the compared technique, and the incremental approach is another two orders of magnitude faster than the initial approach, taking just a few milliseconds. The fast computation time of the incremental approach makes it feasible to use it repeatedly during the loop transformation exploration over a very large number of possibilities. Furthermore, prototype CAD tools have been developed that include most parts of the methodologies.

Hierarchical Memory Size Estimation for Loop Transformation and Data Memory Platform Exploration. Doctoral thesis at NTNU, 2007:63, Qubo Hu

Storage Requirement Estimation and Optimization for Data Intensive Applications

Many integrated circuit systems, especially in the multi-media and telecom domains, are inherently data dominant. For this class of applications, data transfer and storage largely determine cost and performance parameters. This is the case for chip size, since large memories are usually needed. It is however also the situation for performance. Accessing the memories is in more and more cases the main bottleneck and the cache behavior is for a significant part determined by “capacity misses” which are related to size issues. Finally, it is the case for power consumption, since the memories and buses consume large quantities of energy and the “effective data size” influences their capacitive loading in case of RAMs and their miss-related activity in case of caches. In both cases, the power consumption per access is also heavily affected. During the system development process, the designer must hence concentrate first on exploring the data transfer and storage to produce a cost-optimized end product. At the system level, no detailed information is available regarding the size of the memories required for storing data in the alternative realizations of an application. To guide the designer and help in selecting the best solution, estimation techniques for the storage requirements are needed, very early in the system design trajectory. The simplest estimates use the maximal size of the intermediate array data as declared in the application code. This is however not representative for the effective size required for their storage during the actual execution since arrays and parts of arrays may not be alive simultaneously. An array element is alive from the moment it is written, or produced, and until it is read for the last time. This last read is said to consume the element. To achieve accurate estimates, the so-called in-place mapping opportunity generated by these non-overlapping lifetimes must be taken into account. 
For scalars, a relatively simple lifetime analysis suffices, but for arrays, this is extremely complex due to the huge number of signals and the often very complex interdependencies between them.

For our target classes of data dominant applications the high-level description is typically characterized by large multi-dimensional nested loops and arrays. Within the loop nests, statements access the arrays using read and write operations. At the beginning of the design process, no information about the execution order of these loops is available, except what is given from the data dependencies between the statements in the code. As the process progresses, the designer makes decisions that gradually fix the ordering, until the full execution ordering is known. This execution ordering determines the order in which array elements are accessed and hence the lifetimes of the array elements. Since these lifetimes in turn influence the in-place mapping opportunity, the storage requirements of the arrays within the loop nests are largely determined by the execution ordering. To guide the designer it is therefore essential to have storage requirement estimates that can take the available partially fixed execution ordering into account during the exploration of the implementation solution space. Previous work has either not taken execution ordering into account at all, resulting in large overestimates, or required a fully specified ordering. In the latter case, a time consuming full exploration of all possible alternative orderings of the unfixed loop dimensions is needed. This is infeasible during the early system design steps where fast feedback is needed to be able to explore the huge solution space.
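The influence of the execution ordering on storage can be seen in a toy example. The sketch below simulates a hypothetical two-statement loop nest, A[i][j] = f(...) followed by B[i][j] = A[i][j] + A[i-1][j], and counts how many elements of A are alive at the same time for the two possible loop orders. The loop nest and the function name are made up for illustration; the thesis estimates such sizes analytically rather than by simulation.

```python
def max_alive(order, n, m):
    """Peak number of simultaneously alive elements of A for the nest
        A[i][j] = f(...);  B[i][j] = A[i][j] + A[i-1][j]
    executed with loop order "ij" (i outermost) or "ji" (j outermost)."""
    if order == "ij":
        iterations = [(i, j) for i in range(n) for j in range(m)]
    else:
        iterations = [(i, j) for j in range(m) for i in range(n)]
    alive, peak = set(), 0
    for i, j in iterations:
        alive.add((i, j))              # A[i][j] is produced
        peak = max(peak, len(alive))
        alive.discard((i - 1, j))      # last read of A[i-1][j]
        if i == n - 1:
            alive.discard((i, j))      # no later iteration reads A[n-1][j]
    return peak

print(max_alive("ij", 4, 8), max_alive("ji", 4, 8), 4 * 8)
```

With i as the outer loop a full row plus one element must be kept, while interchanging the loops leaves only two elements alive at any time, far below the declared array size.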

The storage requirement estimation methodology proposed in this doctoral thesis solves these important design problems. The methodology is divided into four steps. In the first step, a data-flow graph is generated that reflects the data dependencies in the application code. The array accesses and the dependencies between them are described using a polyhedral model. The second step places the polyhedral descriptions of the array accesses and their dependencies in a so-called common iteration space. A worst-case and a best-case placement may be performed, both taking the available execution ordering into account during the placement. The third step estimates the upper and lower bounds on the storage requirement of individual data dependencies in the code, taking into account the available execution ordering. As the execution ordering is gradually fixed, the upper and lower bounds on the data dependencies converge. This is a very useful and unique property of the methodology. In the fourth and final step, simultaneously alive data dependencies are detected. Their combined maximal size at any time during execution equals the total storage requirement of the application. An important part of the estimation technique utilizes loop ordering guidance to estimate upper and lower bounds on dependency sizes. These guiding principles and the proof of their validity are, together with the general estimation methodology, important contributions of this thesis. The guiding principles can be used for high-level synthesis independently of the storage requirement estimation methodology.
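
The convergence of the bounds can be made concrete with a brute-force sketch on a toy loop nest. This is explicitly not how the thesis computes its estimates: it enumerates every loop order that is still possible, which is the kind of exhaustive exploration the methodology is meant to avoid. The loop body A[i][j][k] = f(A[i-1][j][k]), the sizes in SIZES, and the function names are assumptions chosen for illustration.

```python
from itertools import permutations, product

# Hypothetical loop bounds for a three-dimensional loop nest.
SIZES = {"i": 3, "j": 4, "k": 5}


def peak_live(trace):
    """Peak number of elements that are written and still awaiting a read."""
    last_read = {e: t for t, (op, e) in enumerate(trace) if op == "r"}
    live, peak = set(), 0
    for t, (op, e) in enumerate(trace):
        if op == "w" and last_read.get(e, -1) > t:
            live.add(e)
            peak = max(peak, len(live))
        elif op == "r" and last_read[e] == t:
            live.discard(e)
    return peak


def trace_for(order):
    """Access trace for a loop order given from outermost to innermost."""
    trace = []
    for values in product(*(range(SIZES[name]) for name in order)):
        point = dict(zip(order, values))
        i, j, k = point["i"], point["j"], point["k"]
        if i >= 1:
            trace.append(("r", (i - 1, j, k)))
        trace.append(("w", (i, j, k)))
    return trace


def bounds(fixed_outer=()):
    """Lower and upper bound on the dependency size when only the
    outermost loops in fixed_outer have been decided."""
    free = [name for name in SIZES if name not in fixed_outer]
    peaks = [
        peak_live(trace_for(tuple(fixed_outer) + rest))
        for rest in permutations(free)
    ]
    return min(peaks), max(peaks)


print(bounds())            # nothing fixed
print(bounds(("j",)))      # j fixed as the outermost loop
print(bounds(("j", "k")))  # j, then k fixed: the ordering is fully known
```

In this toy case the bounds narrow from (1, 20) with no ordering fixed, to (1, 5) once j is placed outermost, and finally to (1, 1) when the full ordering j, k, i is known.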

The feasibility and usefulness of the methodology are substantiated using several representative application demonstrators. It is for instance shown how the designer is guided into reducing the memory size of the major arrays in the MPEG-4 Motion Estimation Kernel from 262400 to 257 memory locations. Similar results are achieved for a Cavity Detection algorithm. Applying the methodology to an Updating Singular Value Decomposition algorithm, it is also demonstrated how estimation feedback during global loop reorganization can approximately halve the application's storage requirement. Furthermore, a prototype CAD tool has been developed that includes major parts of the storage requirement estimation and optimization methodology. Using manually generated design examples, the tool proved the feasibility of the techniques and in particular showed that run times are short, on the order of seconds even for substantial applications.

Storage Requirement Estimation and Optimization for Data Intensive Applications. Fys.el.-rapport:2001:24, Per Gunnar Kjeldsberg

Circuits and Systems Group

Head of group is Einar J. Aas.

The Circuits and Systems (CAS) group is one of the six research groups at the Department of Electronics and Telecommunications. The group consists of eight Professors and Associate Professors and approximately fifteen PhD students.

The CAS group's field of expertise covers a broad area of microelectronics design and test. In particular, the group is active in these areas: VLSI digital signal processing, HW/SW co-design, design and test methodology, low-power digital design, analog and mixed signal design, system-on-a-chip design methodology, and device modeling. We cooperate closely with a large number of companies, in particular through the organization Mikroelektronikkforum (Microelectronics Forum). Several of the group members have also gained substantial R&D experience from semiconductor and design companies like Nordic Semiconductor, Kongsberg Seatex, Micron Imaging (Aptina Imaging), Texas Instruments (formerly Chipcon), Eidsvoll Electronics, and SINTEF.


Circuits and Systems Group - Courses

The CAS group educates students in three programs: a five-year sivilingeniør (MSc) degree in electronics (MTEL), a two-year master's degree in electronics for Norwegian students with a bachelor's degree (MIEL), and a two-year Erasmus Mundus International Master Program in Embedded Computing Systems (EMECS).

Norwegian students interested in applying to either of the last two programs can find additional useful information by following this link.


Below you will find a list of courses given by group members and/or courses that are specifically relevant for our activity.

MSc courses

TFE4100 Electric Circuits

TFE4105 Digital Design and Computer Fundamentals

TFE4110 Digital Design and Basic Electrical Circuits

TFE4117 Introduction to Electronics

TFE4140 Modelling and Analysis of Digital Systems

TFE4151 Design of integrated circuits

TFE4170 System-on-a-chip

TFE4175 Realization and Test of Digital Components

TFE4186 Analog CMOS 1

TFE4191 Analog CMOS 2

TFE4200 Analog Integrated Circuits

TFE4850 Experts in Team, Interdisciplinary Project

TTT4100 Electronic Circuits

Specialization

TFE4520 and TFE4521 Design of Digital Systems, Specialization Project (and Master Thesis)

TFE4525 Design of Digital Systems, Specialization
TFE01 Low power digital design

TFE02 HW/SW Codesign of embedded systems

TFE05 High-level Synthesis and Verification

TFE04 Run-time reconfigurable systems

TFE03 Self-test of digital systems

TFE4540 and TFE4541 Analog and Mixed-Signal Design, Specialization Project (and Master Thesis)

TFE4545 Analog and Mixed-Signal Design, Specialization
TFE06 ASIC for MEMS

TFE08 Data Converters

TFE09 Low Voltage/Low Power Analog Integrated Circuits

UNIK UniK-subject - see TFE4615 and TFE4625

PhD courses

FE8109 Design and Utilization of Memory Hierarchies in Multi-Media Applications

FE8110 Low-Voltage/Low-Power Analog CMOS

FE8113 High Speed Data Converters

FE8114 High-Performance Audio DACs

FE8116 Nanoscale CMOS

FE8119 Modelling theory for system on chip and embedded systems

FE8120 Electronic Design Methodology

FE8121 VLSI Test Methodology

FE8122 PhD Seminar in Circuits and Systems Design

Circuits and Systems Group - Research

The CAS group field of expertise covers methods, techniques and tools for microelectronics design and test at circuit and system level. In particular, the group is active in these areas: VLSI digital signal processing, HW/SW co-design, design and test methodology, low-power digital design, analog and mixed signal design, system-on-a-chip design methodology, and device modeling.

Below you will find links to pages related to different projects and activities. The titles of recent PhD theses listed on our Alumni page also give a good indication of our research activities. Finally, you can find our publications through a search for the group members in the Frida database.


Smart Microsystems for Diagnostic Imaging in Medicine
The main objective of this project is to develop robust microsystem technologies for invasive diagnostic imaging...

SINANO
SINANO aims to strengthen European scientific and technological excellence in the field of electronics, Si-based nanodevices for terascale integrated circuits...

eMerge
The main objective of the project is to develop an innovative and advanced educational network structure that will permit the dissemination of real laboratory experiments to support engineering and science education within Europe...

CUBAN
The overall purpose of this research is to develop new types of xDSL systems that are better suited to serve as a part of the backbone network of a broadband wireless network...

Next Generation Lab
NGL provides a framework for web-based simulators and laboratories which allow students to run simulations and measurements remotely in the field of electronics...

HW/SW Codesign and System Level Design

ASIC for MEMS

VLSI for digital signal processing

Circuits and Systems Group - Alumni

Former PhD students and staff of the Circuits and Systems group since its establishment in 1981.
Note: this is not a complete list. It is kept up to date to the best of the webmaster's knowledge.

Havashki, Asghar
PhD Thesis 2009 ''On Power Consumption Issues in FIR Filters with Application to Communication Receivers: Complexity, Word length, and Switching Activity''
Currently R&D Engineer at Nordic Semiconductor
Singh, Tajeshwar
PhD Thesis 2009 ''Capacitive Sensor Interface Circuits''
Currently Design Engineer at Atmel Norway
Wulff, Carsten
PhD Thesis 2008 ''Efficient ADC's for Nano-Scale CMOS Technology''
Currently R&D Engineer at Nordic Semiconductor
Børli, Håkon
PhD Thesis 2008 ''Modeling of Drain Current and Intrinsic Capacitances in Nanoscale Double-Gate and Gate-All-Around MOSFETs''
Løkken, Ivar
PhD Thesis 2008 ''Digital-to-Analog Conversion in High Resolution Audio''
Saha, Shimul Chandra
PhD Thesis 2008 ''RF MEMS Switch Circuits''
Oskuii, Saeeid Tahmasbi
PhD Thesis 2008 ''Design of Low-Power Reduction-Trees in Parallel Multipliers''
Currently Design Engineer at Texas Instruments Norway
Kolberg, Sigbjørn
PhD Thesis 2007 ''Modeling of Electrostatics and Drain Current in Nanoscale Double-Gate MOSFETs''
Hu, Qubo
PhD Thesis 2007 ''Hierarchical Memory Size Estimation for Loop Transformation and Data Memory Platform Exploration''
Currently Design Engineer at Atmel Norway
Sand, Åsmund
PhD Thesis 2007 ''Applying the Internet to Instrumentation and Metrology''
Gjermundnes, Øystein
PhD Thesis 2006 ''Exploiting Arithmetic Built-In Self-Test Techniques for Path Delay Fault Testing''
Currently Hardware Engineer at ARM Norway
Bjørnsen, Johnny Gudmund
PhD Thesis 2005 ''Design of a High-speed, High-Resolution Analog to Digital Converter for Medical Ultrasound Applications''
Currently manager (and founder) of Analog Concepts AS
Westby, Eskild Ronæss
PhD Thesis 2004 ''Macromodelling of Microsystems''
Njølstad, Tormod
Assistant Professor in the CAS group between 1990 and 1996
PhD Thesis 1996 ''On VLSI Realization of a Low-Power Audio Coder with Low System Delay''
Associate Professor in the CAS group between 1996 and 2003
Currently CEO of New Index
Aunet, Snorre
PhD Thesis 2002 ''Real-time reconfigurable devices implemented in UV-light programmable floating-gate CMOS''
Currently Associate Professor at Department of Informatics, University of Oslo
Sølhusvik, Johannes
Adjunct Associate Professor in the CAS group between 1998 and 2002
Currently Senior Research Engineer at Aptina Norway AS
Hernes, Bjørnar
PhD Thesis 2001 ''Design Criteria for Low Distortion in Feedback Opamp Circuits''
Currently Senior Design Engineer at Arctic Silicon Devices
Kjeldsberg, Per Gunnar
PhD Thesis 2001 ''Storage Requirement Estimation and Optimization for Data Intensive Applications''
Currently Professor at IET, NTNU
Strøm, Øyvind
PhD Thesis 2000 ''VLSI Realization of an Embedded Microprocessor Core with Support for Java Instructions''
Currently Product Line Director at Atmel Norway
Torp, Jon H.
Assistant Professor in the CAS group between 1987 and 1988
Associate Professor in the CAS group between 1988 and 1998
Currently retired
Wichlund, Sverre
PhD Thesis 1998 ''Graph partitioning with application to VLSI system realization''
Currently Senior ASIC Designer at Nordic Semiconductor
Haaheim, Nils
Associate Professor in the CAS group between 1981 and 1985
Professor in the CAS group between 1985 and 1997
Currently retired
Aaserud, Oddvar
Professor in the group between 1989 and 1997
Currently CEO of Venturos Venture AS
Sunsbø, Ingil
PhD Thesis 1997 ''Analysis and VLSI Design of Synthesis Filter Bank for Image Subband Coding''
Currently Senior ASIC Designer at Nordic Semiconductor
Bayegan, Markus
Adjunct Professor in the group between 1985 and 1997
Currently retired from the position as chief technology officer at ABB Ltd.
Pihl, Johnny
PhD Thesis 1996 ''Design Automation of High Speed Digital Signal Processing in VLSI with Applications in Speech Recognition Systems based in Hidden Markov Models''
Currently Senior Chip Integration Engineer at Atmel Norway
Øye, Jan Egil
PhD Thesis 1996 ''High Speed Cell Library in CMOS for Bitserial Implementation of DSP Algorithms''
Currently Senior Technical Manager Layout at Nordic Semiconductor
Kristoffersen, Linda
PhD Thesis 1992 ''Exact and Approximate Testability Analysis of Combinational Networks''
Currently Senior Test Engineer at Nordic Semiconductor
van Vo, Nhon
PhD Thesis 1992 ''A contribution to High Level Synthesis of VLSI Systems''
Currently Associate Professor at Institute for Microsystem Technology, Vestfold University College
Larsen, Bjørn B.
PhD Thesis 1991 ''Compact Test Sets by a Pin Toggle Test Generation Strategy''
Currently Associate Professor at IET, NTNU
Sæther, Trond
PhD Thesis 1991 ''ANALOG CMOS IC DESIGN Switched-Capacitor Circuit Cell Library with Integrated Substrate Shielding''
Currently Professor at IET, NTNU
Klingsheim, Karl
PhD Thesis 1990 ''VLSI DESIGN THEORY. Concurrent Decision Cycles in VLSI Systems Design''
Currently general manager of NTNU Technology Transfer
Svarstad, Kjetil
PhD Thesis 1989 ''VLSI building-blocks for PROLOG machines''
Currently Professor at IET, NTNU

The oxide electronics lab - Publications

Study of defect-dipoles in an epitaxial ferroelectric thin film,
C. M. Folkman, S. H. Baek, C. T. Nelson, H. W. Jang, T. Tybell, X. Q. Pan, and C. B. Eom, Appl. Phys. Lett. 96, 052903 (2010)

Structure and properties of multiferroic oxygen hyperstoichiometric BiFe1-xMnxO3+δ,
M. Selbach, T. Tybell, M.-A. Einarsrud, and T. Grande, Chem. Mater. 21, 5176 (2009)

Polarization direction and stability in ferroelectric lead titanate thin films,
Ø. Dahl, J.K. Grepstad, and T. Tybell, J. Appl. Phys. 106, 084104 (2009)

High-temperature semiconducting cubic phase of BiFe0.7Mn0.3O3+δ,
S.M. Selbach, T. Tybell, M.-A. Einarsrud, and T. Grande, Phys. Rev. B 79, 214113 (2009)

Epilayer control of photodeposited materials during UV photocatalysis,
R. Takahashi, M. Katayama, Ø. Dahl, J. K. Grepstad, Y. Matsumoto, and T. Tybell, Appl. Phys. Lett. 94, 232901 (2009)

The fabrication and characterization of PbTiO3 nanomesas realized on nanostructured SrRuO3/SrTiO3 templates,
C.C. You, R. Takahashi, A. Borg, J.K. Grepstad and T. Tybell, Nanotechnology 20, 255705 (2009)

Sputter-deposited (Pb,La) (Zr,Ti)O3 thin films: Effect of substrate and optical properties,
Ø. Nordseth, T. Tybell, J. K. Grepstad, and A. Røyset, J. Vac. Sci. and Tech. 27, 548 (2009)

Epitaxial (Pb,La)(Zr,Ti)O3 thin films on buffered Si(100) by on-axis rf magnetron sputtering,
Ø. Nordseth, T. Tybell, and J.K. Grepstad, Thin Solid Films 517, 2623 (2009)

Comparison of TEM specimen preparation of perovskite thin films by tripod polishing and conventional ion milling,
E. Eberg, Å.F. Monsen, T. Tybell, A.T.J. van Helvoort and R. Holmestad, Journal of electron microscopy 57, 6, 175-179 (2008)

The Ferroic Phase Transitions of BiFeO3,
S.M. Selbach, T. Tybell, M.-A. Einarsrud and T. Grande, Adv. Mat. 20, 3692 (2008)

Ferroelectric stripe domains in PbTiO3 thin films: Depolarization field and domain randomness,
R. Takahashi, Ø. Dahl, E. Eberg, J.K. Grepstad and T. Tybell, J. Appl. Phys. 104, 064109 (2008)

Crystalline and dielectric properties of sputter deposited PbTiO3 thin films,
Ø. Dahl, J.K. Grepstad and T. Tybell, J. Appl. Phys. 103, 114112 (2008)

PbTiO3 nanorod arrays grown by self-assembly of nanocrystals,
P.M. Rørvik, Å. Almli, A.T.J. van Helvoort, R. Holmestad, T. Tybell, T. Grande and M. Einarsrud, Nanotechnology 19, 225605 (2008)

Photochemical switching of ultrathin PbTiO3 films,
R. Takahashi, J.K. Grepstad, T. Tybell and Y. Matsumoto, Appl. Phys. Lett. 92, 112901 (2008)

Size dependent properties of nanocrystalline BiFeO3 particles,
M. Selbach, T. Tybell, M.-A. Einarsrud, and T. Grande, Chem. Mater. 19, 6478 (2007)

Synthesis of BiFeO3 by wet chemical methods,
M. Selbach, M.-A. Einarsrud, T. Tybell and T. Grande, J. Am. Ceram. Soc., 90, 3430 (2007)

Nanoscale structuring of SrRuO3 thin film surfaces by scanning tunneling microscopy,
C.C. You, N-V. Rystad, A. Borg and T. Tybell, Applied Surface Science 253, 4704 (2007)

Formation and electronic properties of oxygen annealed Au/Ni and Pt/Ni contacts to p-type GaN,
S. V. Pettersen, A. P. Grande, T. Tybell, H. Riechert, R. Averbeck and J. K. Grepstad, Semicond. Sci. Technol., 22, 186-193 (2007)

Nanoscale studies of domain wall motion in epitaxial ferroelectric thin films,
P. Paruch, T. Giamarchi, T. Tybell and J.-M. Triscone, J. Appl. Phys., 100, 051608 (2006)

Thickness-dependent properties of (110)-oriented La1.2Sr1.8Mn2O7 thin films,
Y. Takamura, R.V. Chopdekar, J. Grepstad, Y. Suzuki, A.F. Marshall, A. Vailionis, H. Zheng, and J.F. Mitchell, J. Appl. Phys. 99, 085902 (2006)

Imaging of out-of-plane interfacial strain in epitaxial PbTiO3/SrTiO3 thin films,
A.T.J. van Helvoort, Ø. Dahl, B.G. Soleim, R. Holmestad, and T. Tybell, Appl. Phys. Lett. 86, 92907 (2005).

Structural, Magnetic, and Electronic Properties of (110)-Oriented Epitaxial Thin Films of the Bilayer Manganite La1.2Sr1.8Mn2O7,
Y. Takamura, J. Grepstad, R.V. Chopdekar, A.F. Marshall, H. Zheng, J.F. Mitchell, and Y. Suzuki, Appl. Phys. Lett. 87, 142508 (2005)

Characterization of crystalline Pb0.92La0.08Zr0.4Ti0.6O3 thin films grown by off-axis radio frequency magnetron sputtering,
A. K. Sarin Kumar, Ø. Dahl, S. V. Pettersen, J. K. Grepstad, and T. Tybell, Thin Solid Films 492, 71 (2005)

Effects of thermal annealing in oxygen on the antiferromagnetic order and domain structure of epitaxial LaFeO3 thin films,
Jostein K. Grepstad, Yayoi Takamura, Andreas Scholl, Ingebrigt Hole, Yuri Suzuki, and Thomas Tybell, Thin Solid Films 486, 108 (2005)

A High Frequency Surface Acoustic Wave Device Based on Thin Film Piezoelectric Interdigital Transducers,
A. K. Sarin Kumar, P. Paruch and J.-M. Triscone, W. Daniau and S. Ballandras, L. Pellegrino, D. Marré, and T. Tybell, Appl. Phys. Lett., 85, 1757 (2004).

A novel high frequency surface acoustic wave device based on piezoelectric interdigital transducers,
A.K. Kumar, P. Paruch, D. Marre, L. Pellegrino, T. Tybell, S. Ballandras, J.-M. Triscone, Integrated Ferroelectrics 63, 567 (2004)

High temperature transport kinetics in heteroepitaxial LaFeO3 thin films,
I. Hole, T. Tybell, J.K. Grepstad, I. Wærnhus, T. Grande, and K. Wiik. Solid-State Electronics 47, 2279 (2003).

NTNU, NO-7491 Trondheim. Telephone: +47 73 59 50 00.
Editorial responsibility: Director of Information Christian Fossen