Oct. 13, 1999

R. Millan-Gabet

PICNIC Camera Control
=====================

These notes are intended to get us (me) started thinking about what
approach to use for reading out future IOTA IR cameras that use the
basic design of the NICMOS3 camera and a PICNIC array.

Background
**********

Recall that to operate an array such as the NICMOS3 or PICNIC we require:

1) Set of analog electronics, responsible for:

   1-a) Sensing the detector output and amplifying it.
   1-b) Adding a voltage bias to the analog signal.
   1-c) Providing power and bias voltage levels to the detector array.
   1-d) Providing analog and digital power to the various parts of the system.

2) Set of digital electronics, responsible for:

   2-a) Generating and transferring TTL levels necessary to address and
        reset the array pixels, and calibrate the ADC and trigger
        conversions.     
   2-b) Providing logic for adequate timing of ADC sampling.
   2-c) Reading the ADC output into a computer.
	
Experience with the NICMOS3 camera has demonstrated that we are happy
with the basic design by Ta-Chun Li (UMass), in terms of functionality
and noise performance. Therefore, the set of electronics can be
basically copied in future systems, except for minor modifications
that are required to accommodate a PICNIC chip instead of a NICMOS3,
and other modifications that resulted from experience with the first
system, both of which are documented elsewhere.

What is in question in these notes is only items (2-a) and (2-c): how
the TTL patterns that perform the pixel addressing and sampling
functions are generated, and how the ADC output is read.

In that respect, for the NICMOS3 system the approach followed was to
generate those patterns in a fast PC, using static programmed I/O and
an AT interface board (the AT-DIO-32F, from National Instruments). The
readout sequence is achieved in a software loop that continuously
addresses, samples and reads the data from the target pixels (2). The
software loop contains the code that achieves this for a single sample
per pixel, and the number of loop iterations equals the number of data
points in a scan. The constancy of the integration time from sample to
sample relies on the constancy of the software loop execution time. We
found that when this software ran under Windows (3.1), the signal
was contaminated by large spikes, and that the problem disappeared when
the software ran under DOS. We have neither a detailed understanding of
this nor any tests, but in a general sense we attribute it to DOS being
a better approximation to a real time operating system (RTOS) than
Windows.
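
To make the software-loop idea concrete, here is a minimal sketch in
Python (not the actual NICMOS3 code, which does register level I/O on
the AT-DIO-32F). The names read_scan, FakeBoard, write_port and
read_port are hypothetical stand-ins, and the pixel addresses are
made up. The point is only that the sample-to-sample interval is
whatever the loop happens to take, so any jitter in loop execution
shows up directly as jitter in the integration time.

```python
import time


def read_scan(n_samples, pixel_addresses, write_port, read_port):
    """Software-timed scan: one loop iteration per data point in the scan.

    write_port and read_port are hypothetical stand-ins for the register
    accesses to the digital I/O board.
    """
    samples, stamps = [], []
    for _ in range(n_samples):
        stamps.append(time.perf_counter())
        row = []
        for address in pixel_addresses:
            write_port(address)      # address the pixel (and trigger the ADC)
            row.append(read_port())  # read the ADC word back
        samples.append(row)
    # Sample-to-sample intervals; their spread is the integration-time jitter.
    intervals = [b - a for a, b in zip(stamps, stamps[1:])]
    return samples, intervals


class FakeBoard:
    """Hypothetical board model: a read returns the last word written."""

    def __init__(self):
        self.last = 0

    def write(self, word):
        self.last = word

    def read(self):
        return self.last


board = FakeBoard()
samples, intervals = read_scan(100, [0x20, 0x21], board.write, board.read)
jitter = max(intervals) - min(intervals)
```

On a general purpose OS the value of jitter varies from run to run,
which is the behavior we are trying to avoid.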

For reference, the frame rate we achieve using this method can be
quantified by knowing that it allows digital I/O at about 400
Kwords/s. The other numbers that affect the frame rate (but have
nothing to do with the issue we are discussing here) are the pixel
settling time (about 10 usec) and the AD conversion time (10
usec). Allowing for some extra needed overhead operations, this
results in our minimum frame time of 0.18 msec, for reading two pixels
near (row,column)=(32,32) once.
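
The numbers above can be put into a rough timing budget. The sketch
below only does the arithmetic; it assumes each of the two pixels is
settled and converted once per frame, and it expresses the remaining
time as an equivalent number of I/O words at the quoted rate. How
that remainder actually splits between addressing words and other
overhead is not documented here, so that part is an assumption.

```python
# Figures quoted in the text above.
IO_RATE_WORDS_PER_S = 400e3  # digital I/O rate, about 400 Kwords/s
SETTLE_S = 10e-6             # pixel settling time
CONVERT_S = 10e-6            # AD conversion time
FRAME_S = 0.18e-3            # minimum frame time for two pixels read once
N_PIXELS = 2

word_time_s = 1.0 / IO_RATE_WORDS_PER_S       # 2.5 usec per I/O word
fixed_s = N_PIXELS * (SETTLE_S + CONVERT_S)   # 40 usec of settling + conversion
remainder_s = FRAME_S - fixed_s               # 140 usec left for I/O and overhead
equivalent_words = remainder_s / word_time_s  # about 56 word transfers
max_frame_rate_hz = 1.0 / FRAME_S             # about 5.6 kHz
```

In other words, most of the 0.18 msec is spent outside the settling
and conversion times, which is why the digital I/O rate matters.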

Also, in the NICMOS3 system, a second AT-DIO-32F is used to transfer
the data acquired to the IOTA control computer (a Quadra), but I
consider this a separate topic since the data is transferred on a
scan-by-scan basis, with no impact on the timing during data
acquisition. We will need a similar data transfer mechanism in future
cameras, but many options exist that easily satisfy our bandwidth
requirements.

The AT-DIO-32F comes with the NI-DAQ software, version 4.8 (drivers and
function libraries), which supports applications for DOS and Windows
3.1. It also comes with full hardware documentation that allows
register level programming (RLP) instead of using the NI-DAQ
functions. We chose the RLP option.

The Problem
***********

The method described above for controlling the NICMOS3/PICNIC readout
has become obsolete because the AT-DIO-32F board has been replaced
with the AT-DIO-32HS or PCI-DIO-32HS, and these new boards are not
compatible with v.4.8 of the NI-DAQ software, which was the latest
version to support DOS applications. The new software is NI-DAQ
version 6.1, which only supports applications in Windows
95/98/NT. Moreover, N.I. has dropped "standard" support of RLP,
although "beta" versions of hardware manuals with bit-by-bit
descriptions of configuration and data registers are available upon
request.

So, What Do We Do?
******************

To be completely practical, let's start by recalling that we own 2
spare AT-DIO-32F boards, therefore it *is* possible to simply copy the
existing system once, or twice if another method is used to transfer
the data to the main computer.

However, it is probably a bad idea to depend on hardware that can no
longer be purchased, and on software that is not supported by the new,
available hardware. Therefore, beyond solving our immediate problem,
perhaps we should take this as an opportunity to consider new and
better options, also keeping in mind new constraints such as being
able to adequately sample more than 2 pixels for operation with 3
telescopes and with FLUOR, and having an easy interface to the future
IOTA control system, as well as those of FLUOR and CHARA.

I can think of several options for a new approach to items (2-a,c)
that I list and briefly comment on below. In deciding between those
options, I would say that the criterion should be to have, as soon as
possible, a system that satisfies our bandwidth requirements (with 2
and 3 telescopes, and for classical and FLUOR beam combination), and
integrates well in the overall control system of IOTA and/or CHARA.


Option 1) Do as before, but using DIO-32HS boards and RLP
          ***********************************************

I have obtained the beta RLP manuals from N.I., therefore we could try
to figure this out and program these boards at a low level ourselves.
If we do this, we might as well make use of the PCI version of this
board, instead of the AT, which should result in faster digital I/O
(by how much? I don't know yet, I find the specs unclear on this
point, need to talk to NI). (note: we own 2 of the AT-DIO-32HS, none
of the PCI-DIO-32HS).

* Advantage: This would be an application that will look a lot like the
	     current system, except that we would be using the current
	     generation I/O boards. The software needs to change only
	     in the part that interacts with I/O board, and will still
	     run under DOS for adequate "real time" performance.

* Disadvantage: Are the "non supported" RLP manuals reliable?  

Option 2) DIO-32HS under Windows 95 using Pattern Generation
          **************************************************

Here we would attempt to solve the problem we had running our software
under Windows by making use of more hardware-oriented methods for
generating digital output/input using the NI boards. In a transfer
mode called "pattern generation" the digital patterns needed for a
readout sequence are loaded in a buffer and, by proper configuration
of the board, they get output in sequence, one word every time a
handshaking line is asserted by an internal clock. These transfers use
DMA and therefore do not involve the computer CPU, and the regularity
of the digital output is guaranteed by the (one way) handshaking with
the hardware clock. Because our readout sequence involves reading in
data as well as sending out patterns, probably this option would
involve two similar processes running in parallel, in two boards or in
two groups of ports of the same board, with a lot of the difficulty
residing in how to synchronize pixel sampling and data input. In order
to ensure that no CPU activity takes place during data taking, this
should be set up so that the DMA output/input takes care of an entire
scan, as opposed to on a sample-by-sample basis.
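
As an illustration of the buffer idea (not of the NI-DAQ interface,
which is not used here), the sketch below builds one buffer holding
the patterns for an entire scan and computes when each word would
leave the board if a hardware clock paces the transfer. The 8-bit
word values, the 1 MHz clock and the function names are hypothetical.

```python
def build_scan_buffer(n_samples, per_sample_words):
    """Repeat the per-sample pattern n_samples times into a single buffer."""
    buffer = []
    for _ in range(n_samples):
        buffer.extend(per_sample_words)
    return buffer


def output_times(buffer, clock_hz):
    """Output time of each word when a hardware clock paces the transfer."""
    period = 1.0 / clock_hz
    return [k * period for k in range(len(buffer))]


# Hypothetical words: address pixel A, convert, address pixel B, convert.
PER_SAMPLE = [0x20, 0xA0, 0x21, 0xA1]

scan_buffer = build_scan_buffer(256, PER_SAMPLE)
times = output_times(scan_buffer, clock_hz=1e6)
sample_period = times[len(PER_SAMPLE)] - times[0]  # time between samples
```

Since the whole scan sits in one buffer, the sample period is set by
the clock and the pattern length only, not by anything the CPU does
during the scan.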

Note: This mode is also available on the AT-DIO-32F, so it could also
be considered for option (1), for even better performance.

* Advantage: If this does in fact result in stable timing, it would
	     allow us to continue using the NI boards, with which we
	     have some familiarity now, and to work under the very common
	     Windows environment. Because we would now be developing
	     a Windows  application, we would be able to use the
	     NI-DAQ software, and save some programming time by
	     avoiding the RLP details.

* Disadvantage: If the NI-DAQ software is not flexible enough to
		accommodate our exact needs, we would have to resort to
		RLP anyway, and in that case there would be no gain with
		respect to option (1).

Option 3) Use a RTOS
          **********

Here we take a whole new direction. In a RTOS, a dedicated CPU (called
the target) runs the timing-critical code. A host computer is used to
develop this code and load it into the target CPU, and to communicate
with it for low bandwidth operations. This is guaranteed to satisfy
our requirements, and the tools would be state of the art.

Three sub-options come to mind:

3-1) RT Labview & RT DAQ board:
     -------------------------
 
This is a new line of products from N.I. One needs to buy software (RT
Labview) and hardware (RT DAQ). It turns out that one of the RT DAQ
boards is essentially the DIO-32HS, but with the dedicated CPU
on-board. According to specs, this should do digital I/O about 40
times faster than our current system.
   
These are well supported products, and we know a lot of people who are
happy with LabView. These products also tend to be designed in a
fairly developer-friendly way, and the programming environment (PC +
Windows) is a familiar one. Therefore, even though this is new
territory for us, it could be relatively easy to learn.

The disadvantage is that, except with respect to FLUOR, this
introduces a total stranger as far as control systems at IOTA and
CHARA are concerned. Also, the cost is high (about $7,000).

3-2) VxWorks and VME I/O Board or SBC:
     --------------------------------
   
Same idea, but using that other software & hardware. The advantage is
that this is also the platform chosen for the new IOTA control
system. The disadvantage is that the learning curve on this very
sophisticated system is likely to be considerably longer, and this
RTOS is probably overkill for such a simple application.

I need to find out whether we could get a version of the development
tools that runs in a PC as the host (so that we don't have to buy a
Sun station to develop this system here at CfA) under the license that
UMass bought, and if not, how much it would cost to get that extra
license and the hardware.

3-3) RT Linux and DIO-32HS board:
     ---------------------------

Again, the same idea. RT Linux is good and cheap, and would overlap
nicely with existing CHARA software; but not so much with IOTA. The
difficulty is that we would have to write a device driver for the
DIO-32HS board, using the RLP documentation; or can we find on the web
a driver that someone has already written?

Option 4) Microcontroller
          ***************

The distinction between a microcontroller and a RTOS seems to me like
it should be a small one for this kind of application, where the
dedicated CPU will run a single task - reading out the camera. The
difference is, I believe, that in a microcontroller there is no OS at
all, just the executable code running the application, and in that
sense this solution is a better match to our needs (we *don't* need an
OS). The development is similar in the sense that a "host" computer is
used to write the code, which gets downloaded into the microcontroller
CPU. This is done using software tools and a compiler (C for example)
provided with the hardware. Some other IOTA control computer would
communicate with the microcontroller to set readout parameters and
display/store the scan data. The cost of these systems is low (a few
hundred dollars). The problems with this solution might be that (1)
the microcontrollers available in the market may not be fast enough
for our needs, and (2) that they may not provide enough digital I/O
capabilities (we need at least an 8 bit port for output and a 17 bit
port for input). The market needs to be researched to find out what is
available. 
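
Regarding point (2), a part that only offers 8-bit ports might still
work if the 17-bit input is assembled from several port reads. The
sketch below assumes 8-bit ports and a hypothetical wiring in which
the 17th bit arrives on bit 0 of a third port; the function names are
made up. It also assumes the input word is held stable (latched)
while the three ports are read.

```python
PORT_BITS = 8     # assumed port width of the candidate microcontroller
OUTPUT_BITS = 8   # output requirement from the text above
INPUT_BITS = 17   # input requirement from the text above


def ports_needed(bits, port_bits=PORT_BITS):
    """Number of ports needed to carry the given number of bits."""
    return -(-bits // port_bits)  # ceiling division


def assemble_input(low, mid, high):
    """Combine three 8-bit port reads into one 17-bit word.

    Hypothetical wiring: bit 16 of the word is bit 0 of the 'high' port.
    """
    return ((high & 0x01) << 16) | ((mid & 0xFF) << 8) | (low & 0xFF)


total_ports = ports_needed(OUTPUT_BITS) + ports_needed(INPUT_BITS)  # 1 + 3
word = assemble_input(0x34, 0x12, 0x01)  # 0x11234
```

The extra port reads cost time, so this would have to be weighed
against point (1).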
 
Option 5) All Hardware
          ************

The digital patterns that need to be generated to address and sample
the pixels can easily be generated in hardware, in a similar way as in
the CCD cameras made for IOTA at SAO. The general idea is that an
oscillator drives a counter, the output of which addresses memory
locations of an EEPROM that has been preloaded with the correct
sequence of bit patterns. It is less clear to me how the data input is
done, perhaps also in hardware using the EOC bit and some digital
logic to store the data in FIFO followed by DMA transfer into the PC
using an I/O board, or using the EOC bit to trigger interrupts.
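
The output side of this scheme can be modeled in a few lines. This is
a behavioral sketch only, assuming the counter wraps around at the
end of the stored sequence; the class name and the pattern words are
hypothetical, and the data input side is not modeled.

```python
class PatternSequencer:
    """Counter clocked by an oscillator; its value addresses a pattern memory."""

    def __init__(self, eeprom_words):
        self.eeprom = list(eeprom_words)  # preloaded bit patterns
        self.counter = 0

    def tick(self):
        """One oscillator period: output the addressed word, advance the counter."""
        word = self.eeprom[self.counter]
        self.counter = (self.counter + 1) % len(self.eeprom)  # wrap around
        return word


# Hypothetical pattern: address pixel A, convert, address pixel B, convert.
EEPROM = [0x20, 0xA0, 0x21, 0xA1]

sequencer = PatternSequencer(EEPROM)
output = [sequencer.tick() for _ in range(6)]
```

One word is output per oscillator period, so the readout rate is set
by the oscillator frequency and the length of the stored sequence.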

This would certainly result in the fastest readout time (limited by the
frequency of the oscillator). It is also elegant because it involves
only hardware for the time critical tasks. The disadvantages are that
it probably involves considerable development effort, and that burning
the digital patterns in EEPROM removes a lot of the flexibility we
have when we choose among readout modes and target pixels in software.


Conclusions
***********

None yet. I would like to find a software solution, so Option (5) is
the one I like least. I would also like to find a solution that is
well matched to the task, and doesn't use more computer and software
resources than are really needed. For this reason Options (1) and (2)
don't seem particularly attractive since they still involve using a
whole PC and either DOS (Option 1) or Windows (Option 2), for a task
that doesn't even require an OS at all (recall that in these options,
as in the current method used, the PC doesn't even process or display
the data in any way, it transfers it to another computer for those
higher level tasks). In that respect, I like the microcontroller or
RTOS options. A microcontroller is small and cheap and would
essentially do what the NICMOS3 Pentium does now but in a discrete
little box and with no OS. A RTOS running in a dedicated board still
adds a whole computer into the scheme, the host computer, but at least
we gain something in the sense that the timing sensitive code is
running in real time in the target CPU, and the host is available to
do other things such as data processing and display. In that category,
the RT LabView option is attractive, except for the price, because it
is likely to be the fastest to learn; and the VxWorks option is
attractive for compatibility and future integration with the overall
system.