pyhdf.V
A module of the pyhdf package implementing the V (Vgroup)
API of the NCSA HDF4 library.
(see: hdf.ncsa.uiuc.edu)
Author: Andre Gosselin
Maurice-Lamontagne Institute
gosselina@dfo-mpo.gc.ca
Version: 0.7-3
Date: July 13 2005
Table of contents
-----------------
Introduction
Accessing the V module
Package components
Prerequisites
Summary of differences between the pyhdf and C V API
Error handling
V needs support from the HDF module
Classes summary
Attribute access: low and high level
Predefined attributes
Programming models
Module documentation
Introduction
------------
V is one of the modules composing pyhdf, a python package implementing
the NCSA HDF library and letting one manage HDF files from within a python
program. Two versions of the HDF library currently exist, version 4 and
version 5. pyhdf only implements version 4 of the library. Many
different APIs are to be found inside the HDF4 specification.
Currently, pyhdf implements just a few of those: the SD, VS and V APIs.
Other APIs should be added in the future (GR, AN, etc).
The V API supports the definition of vgroups inside an HDF file. A vgroup
can be thought of as a collection of arbitrary "references" to other HDF
objects defined in the same file. A vgroup may hold references to
other vgroups. It is thus possible to organize HDF objects into some sort
of a hierarchy, similar to files grouped into a directory tree under unix.
The hierarchical nature of vgroups partly explains the origin of the "HDF"
name (Hierarchical Data Format). vgroups can help logically organize the
contents of an HDF file, for example by grouping together all the datasets
belonging to a given experiment, and subdividing those datasets according
to the day of the experiment, etc.
The V API provides functions to find and access an existing vgroup,
create a new one, delete a vgroup, identify the members of a vgroup, add
and remove members to and from a vgroup, and set and query attributes
on a vgroup. The members of a vgroup are identified through their tags
and reference numbers. Tags are constants identifying each main object type
(dataset, vdata, vgroup). Reference numbers serve to distinguish among
objects of the same type. To add an object to a vgroup, one must first
initialize that object using the API proper to that object type (eg: SD for
a dataset) so as to create a reference number for that object, and then
pass this reference number and the type tag to the V API. When reading the
contents of a vgroup, the V API returns the tags and reference numbers of
the objects composing the vgroup. The user program must then call the
proper API to process each object, based on the tag of that object (eg: VS for
a tag identifying a vdata object).
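The tag-based dispatch described above can be sketched in plain python,
without touching an HDF file. This is only an illustration: route_members()
and API_FOR_TAG are hypothetical names, and the numeric tag values are
given for the sake of the example. A real program would use the
HC.DFTAG_NDG, HC.DFTAG_VH and HC.DFTAG_VG constants instead.

```python
# Illustrative tag values (assumption); a real program would use the
# HC.DFTAG_NDG, HC.DFTAG_VH and HC.DFTAG_VG constants of pyhdf.HDF.
DFTAG_NDG = 720     # dataset
DFTAG_VH = 1962     # vdata
DFTAG_VG = 1965     # vgroup

# Map each tag to the name of the API that must process the object.
API_FOR_TAG = {
    DFTAG_NDG: 'SD',
    DFTAG_VH: 'VS',
    DFTAG_VG: 'V',
}

def route_members(tagrefs):
    """Group the reference numbers of (tag, ref) pairs by the API able
    to process them. Members whose tag is not in API_FOR_TAG are
    collected under the None key."""
    routed = {}
    for tag, ref in tagrefs:
        routed.setdefault(API_FOR_TAG.get(tag), []).append(ref)
    return routed

# A made-up vgroup content: two vdatas, one dataset, one vgroup.
members = [(DFTAG_VH, 2), (DFTAG_NDG, 5), (DFTAG_VG, 7), (DFTAG_VH, 9)]
routed = route_members(members)
```

Here 'routed' maps 'VS' to the two vdata reference numbers, 'SD' to the
dataset and 'V' to the member vgroup.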
Some limitations of the V API must be stressed. First, HDF imposes
no integrity constraint whatsoever on the contents of a vgroup, nor does it
help maintain such integrity. For example, a vgroup is not strictly
hierarchical, because an object can belong to more than one vgroup. It would
be easy to create vgroups showing cycles among their members. Also, a vgroup
member is simply a reference to an HDF object. If this object is afterwards
deleted for any reason, the vgroup membership will not be automatically
updated. The vgroup will refer to a non-existent object and thus be left
in an inconsistent state. Nothing prevents adding the same member more than
once to a vgroup, and giving the same name to more than one vgroup.
Finally, the HDF library seems to make heavy use of vgroups for its own
internal needs, and creates vgroups "behind the scenes". This may make it
difficult to pick up "user defined" vgroups when browsing an HDF file.
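Since HDF does not guard against cycles, a program walking a vgroup
hierarchy may want to check for them itself. The following is a minimal
sketch of such a check, working on a plain dictionary rather than on an
HDF file. find_cycle() is a hypothetical helper, and the dictionary is
assumed to have been filled beforehand from the vgroup members found in
the file.

```python
def find_cycle(members, start):
    """Return a list of vgroup reference numbers forming a cycle
    reachable from 'start', or None if no cycle is found.

    'members' maps a vgroup reference number to the reference numbers
    of the vgroups it holds (non-vgroup members are left out)."""
    path = []       # vgroups on the current descent
    done = set()    # vgroups fully explored and known to be cycle free

    def visit(ref):
        if ref in path:
            # We came back to a vgroup we are still exploring.
            return path[path.index(ref):] + [ref]
        if ref in done:
            return None
        path.append(ref)
        for child in members.get(ref, ()):
            cycle = visit(child)
            if cycle:
                return cycle
        path.pop()
        done.add(ref)
        return None

    return visit(start)

# Vgroup 4 belongs to both 2 and 3: not strictly hierarchical, but no cycle.
shared = {1: [2, 3], 2: [4], 3: [4], 4: []}
# Vgroup 3 holds a reference back to vgroup 1: a cycle.
loop = {1: [2], 2: [3], 3: [1]}
```

A member referring to a deleted vgroup is simply treated as having no
members of its own, so dangling references do not stop the walk.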
Accessing the V module
-----------------------
To access the V module a python program can say one of:
>>> import pyhdf.V # must prefix names with "pyhdf.V."
>>> from pyhdf import V # must prefix names with "V."
>>> from pyhdf.V import * # names need no prefix
This document assumes the last import style is used.
V is not self-contained, and needs functionality provided by another
pyhdf module, namely the HDF module. This module must thus be imported
also:
>>> from pyhdf.HDF import *
Package components
------------------
pyhdf is a proper Python package, i.e. a collection of modules stored under
a directory whose name is that of the package and which stores an
__init__.py file. Following the normal installation procedure, this
directory will be '<python-lib>/site-packages/pyhdf', where <python-lib>
stands for the python installation directory.
A corresponding set of modules exists for each HDF API.
The following modules are related to the V API.

  _hdfext   C extension module responsible for wrapping the HDF
            C library for all python modules
  hdfext    python module implementing some utility functions
            complementing the _hdfext extension module
  error     defines the HDF4Error exception
  HDF       python module providing support to the V module
  V         python module wrapping the V API routines inside
            an OOP framework

_hdfext and hdfext were generated using the SWIG preprocessor.
SWIG is however *not* needed to run the package. Those two modules
are meant to do their work in the background, and should never be called
directly. Only HDF and V should be imported by the user program.
Prerequisites
-------------
The following software must be installed in order for the V module to
work.
HDF (v4) library
pyhdf does *not* include the HDF4 library, which must
be installed separately.
HDF is available at:
"http://hdf.ncsa.uiuc.edu/obtain.html".
Numeric is also needed by the SD module. See the SD module documentation.
Summary of differences between the pyhdf and C V API
-----------------------------------------------------
Most of the differences between the pyhdf and C V API can
be summarized as follows.
-In the C API, every function returns an integer status code, and values
computed by the function are returned through one or more pointers
passed as arguments.
-In pyhdf, error statuses are returned through the Python exception
mechanism, and values are returned as the method result. When the
C API specifies that multiple values are returned, pyhdf returns a
sequence of values, which are ordered similarly to the pointers in the
C function argument list.
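The two conventions can be contrasted with a small sketch written in pure
python. Nothing here calls the HDF library: c_like_gettagref() and
py_like_tagref() are hypothetical names, the 'out' dictionary plays the
role of the C pointers, and the HDF4Error class is a stand-in for the one
defined by pyhdf.

```python
FAIL = -1       # status reported by the C library on failure
SUCCEED = 0

class HDF4Error(Exception):
    """Stand-in for the pyhdf exception, so that the sketch runs alone."""

def c_like_gettagref(members, index, out):
    """C convention: the result is a status code, the computed values
    are delivered through 'out' (playing the role of the pointers)."""
    if not 0 <= index < len(members):
        return FAIL
    out['tag'], out['ref'] = members[index]
    return SUCCEED

def py_like_tagref(members, index):
    """pyhdf convention: the values are the result, ordered like the
    pointers of the C call, and a failure raises an exception."""
    out = {}
    if c_like_gettagref(members, index, out) == FAIL:
        raise HDF4Error("tagref: execution error")
    return out['tag'], out['ref']

# Made-up vgroup content: (tag, ref) pairs.
members = [(1962, 2), (720, 5)]
tag, ref = py_like_tagref(members, 1)
```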
Error handling
--------------
All errors reported by the C V API with a SUCCESS/FAIL error code
are reported by pyhdf using the Python exception mechanism.
When the C library reports a FAIL status, pyhdf raises an HDF4Error
exception (a subclass of Exception) with a descriptive message.
Unfortunately, the C library is rarely informative about the cause of
the error. pyhdf does its best to try to document the error, but most
of the time cannot do more than say "execution error".
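A common way of dealing with such exceptions is to wrap a call inside a
small helper. The sketch below assumes a hypothetical find_or_none()
helper, and uses stand-ins for HDF4Error and for the V instance so that it
can run without an HDF file. In a real program, HDF4Error would come from
the pyhdf package and 'v' would be a true V instance.

```python
class HDF4Error(Exception):
    """Stand-in for the pyhdf exception, so that the sketch runs alone."""

def find_or_none(v, name):
    """Return the reference number of the vgroup named 'name', or None
    when the lookup fails. 'v' is any object whose find() method raises
    HDF4Error on failure."""
    try:
        return v.find(name)
    except HDF4Error:
        return None

class FakeV(object):
    """Stand-in for a V instance knowing a single vgroup."""
    def find(self, name):
        if name != 'TOTAL':
            raise HDF4Error("find: execution error")
        return 2

v = FakeV()
found = find_or_none(v, 'TOTAL')
missing = find_or_none(v, 'NOSUCH')
```

Because the error message is rarely informative, such a helper cannot tell
a missing vgroup apart from any other failure of the call.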
V needs support from the HDF module
------------------------------------
The V module is not self-contained (contrary to the SD module).
It requires help from the HDF module, namely:
-the HDF.HDF class to open and close the HDF file, and initialize the
V interface
-the HDF.HC class to provide different sorts of constants (opening modes,
data types, etc).
A program wanting to access HDF vgroups will almost always need to execute
the following minimal set of calls:
>>> from pyhdf.HDF import *
>>> from pyhdf.V import *
>>> hdfFile = HDF(name, HC.xxx) # open HDF file
>>> v = hdfFile.vgstart() # initialize V interface on HDF file
>>> ... # manipulate vgroups
>>> v.end() # terminate V interface
>>> hdfFile.close() # close HDF file
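To guarantee that end() and close() are called even when an exception is
raised while manipulating vgroups, the calls above can be wrapped inside a
context manager. This is a sketch only: v_session() is a hypothetical
helper which is not part of pyhdf, it assumes a python version providing
the contextlib module, and the two Fake classes are stand-ins letting the
sketch run without an HDF file.

```python
from contextlib import contextmanager

@contextmanager
def v_session(hdf_file):
    """Yield the V interface of an open HDF instance, then make sure
    end() and close() run, in that order, even after an exception."""
    try:
        v = hdf_file.vgstart()
        try:
            yield v
        finally:
            v.end()
    finally:
        hdf_file.close()

class FakeV(object):
    """Stand-in for a V instance, recording the calls it receives."""
    def __init__(self, log):
        self.log = log
    def end(self):
        self.log.append('end')

class FakeHDF(object):
    """Stand-in for an HDF instance, recording the calls it receives."""
    def __init__(self):
        self.log = []
    def vgstart(self):
        self.log.append('vgstart')
        return FakeV(self.log)
    def close(self):
        self.log.append('close')

fake = FakeHDF()
with v_session(fake) as v:
    fake.log.append('work')     # vgroups would be manipulated here
```

With a real file, the argument would be the result of HDF(name, HC.xxx).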
Classes summary
---------------
pyhdf wraps the V API using the following python classes:

  V       HDF V interface
  VG      vgroup
  VGAttr  vgroup attribute

In more detail:

V       The V class implements the V (Vgroup) interface applied to an
        HDF file.
        To instantiate a V class, call the vgstart() method of an
        HDF instance.

        methods:
          constructors
            attach()     open an existing vgroup given its name or its
                         reference number, or create a new vgroup,
                         returning a VG instance for that vgroup
            create()     create a new vgroup, returning a VG instance
                         for that vgroup

          closing the interface
            end()        close the V interface on the HDF file

          deleting a vgroup
            delete()     delete the vgroup identified by its name or
                         its reference number

          searching
            find()       find a vgroup given its name, returning
                         the vgroup reference number
            findclass()  find a vgroup given its class name, returning
                         the vgroup reference number
            getid()      return the reference number of the vgroup
                         following the one with the given reference number

VG      The VG class encapsulates the functionality of a vgroup.
        To instantiate a VG class, call the attach() or create() methods
        of a V class instance.

          constructors
            attr()       return a VGAttr instance representing an attribute
                         of the vgroup
            findattr()   search the vgroup for a given attribute,
                         returning a VGAttr instance for that attribute

          ending access to a vgroup
            detach()     terminate access to the vgroup

          adding a member to a vgroup
            add()        add to the vgroup the HDF object identified by its
                         tag and reference number
            insert()     insert a vdata or a vgroup in the vgroup, given
                         the vdata or vgroup instance

          deleting a member from a vgroup
            delete()     remove from the vgroup the HDF object identified
                         by the given tag and reference number

          querying vgroup
            attrinfo()   return info about all the vgroup attributes
            inqtagref()  determine if the HDF object with the given
                         tag and reference number belongs to the vgroup
            isvg()       determine if the member with the given reference
                         number is a vgroup object
            isvs()       determine if the member with the given reference
                         number is a vdata object
            nrefs()      return the number of vgroup members with the
                         given tag
            tagref()     get the tag and reference number of a vgroup
                         member, given the index number of that member
            tagrefs()    get the tags and reference numbers of all the
                         vgroup members

VGAttr  The VGAttr class provides methods to set and query vgroup
        attributes.
        To create an instance of this class, call the attr() method
        of a VG instance.
        Remember that vgroup attributes can also be set and queried by
        applying the standard python "dot notation" on a VG instance.

          get attribute value(s)
            get()        obtain the attribute value(s)

          set attribute value(s)
            set()        set the attribute to the given value(s) of the
                         given type, first creating the attribute if
                         necessary

          query attribute info
            info()       retrieve attribute name, data type, order and
                         size

Attribute access: low and high level
------------------------------------
The V API allows setting attributes on vgroups. Attributes can be of many
types (int, float, char) of different bit lengths (8, 16, 32, 64 bits),
and can be single or multi-valued. Values of a multi-valued attribute must
all be of the same type.
Attributes can be set and queried in two different ways. First, given a
VG instance (describing a vgroup object), the attr() method of that instance
is called to create a VGAttr instance representing the wanted attribute
(possibly non existent). The set() method of this VGAttr instance is then
called to define the attribute value, creating it if it does not already
exist. The get() method returns the current attribute value. Here is an
example.
>>> from pyhdf.HDF import *
>>> from pyhdf.V import *
>>> f = HDF('test.hdf', HC.WRITE) # Open file 'test.hdf' in write mode
>>> v = f.vgstart() # init vgroup interface
>>> vg = v.attach('vtest', 1) # attach vgroup 'vtest' in write mode
>>> attr = vg.attr('version') # prepare to define the 'version' attribute
# on the vgroup
>>> attr.set(HC.CHAR8,'1.0') # set attribute 'version' to string '1.0'
>>> print attr.get() # get and print attribute value
>>> attr = vg.attr('range') # prepare to define attribute 'range'
>>> attr.set(HC.INT32,(-10, 15)) # set attribute 'range' to a pair of ints
>>> print attr.get() # get and print attribute value
>>> vg.detach() # "close" the vgroup
>>> v.end() # terminate the vgroup interface
>>> f.close() # close the HDF file
The second way consists of setting/querying an attribute as if it were a
normal python class attribute, using the usual dot notation. The above
example then becomes:
>>> from pyhdf.HDF import *
>>> from pyhdf.V import *
>>> f = HDF('test.hdf', HC.WRITE) # Open file 'test.hdf' in write mode
>>> v = f.vgstart() # init vgroup interface
>>> vg = v.attach('vtest', 1) # attach vgroup 'vtest' in write mode
>>> vg.version = '1.0' # create vgroup attribute 'version',
# setting it to string '1.0'
>>> print vg.version # print attribute value
>>> vg.range = (-10, 15) # create attribute 'range', setting
# it to the pair of ints (-10, 15)
>>> print vg.range # print attribute value
>>> vg.detach() # "close" the vgroup
>>> v.end() # terminate the vgroup interface
>>> f.close() # close the HDF file
Note how the dot notation greatly simplifies and clarifies the code.
Some latitude is however lost by manipulating attributes in that way,
because the pyhdf package, not the programmer, is then responsible for
setting the attribute type. The attribute type is chosen to be one of:

  HC.CHAR8    if the attribute value is a string
  HC.INT32    if all attribute values are integers
  HC.FLOAT64  otherwise

The first way of handling attribute values must be used if one wants to
define an attribute of any other type (for ex. 8 or 16 bit integers,
signed or unsigned). Also, only a VGAttr instance gives access to attribute
info, through its info() method.
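The type selection rule stated above can be written down as a short
function. This is a sketch of the rule as documented here, not the code
pyhdf actually runs: guess_attr_type() is a hypothetical name, and it
returns the name of the HC constant as a string so that it can run without
pyhdf.

```python
def guess_attr_type(value):
    """Return the name of the HC type constant that the documented rule
    selects for 'value' (a string, a number or a sequence of numbers)."""
    if isinstance(value, str):
        return 'CHAR8'
    # Treat a single number like a one-element sequence.
    values = value if isinstance(value, (list, tuple)) else [value]
    if all(isinstance(x, int) for x in values):
        return 'INT32'
    return 'FLOAT64'

version_type = guess_attr_type('1.0')       # a string
range_type = guess_attr_type((-10, 15))     # all integers
mixed_type = guess_attr_type((1, 2.5))      # anything else
```

A value such as (-10, 15) is thus stored as 32 bit integers even if 8 bit
integers would do, which is why the attr() and set() methods remain useful.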
However, accessing HDF attributes as if they were python attributes raises
an important issue. There must exist a way to assign generic attributes
to the python objects without requiring those attributes to be converted
to HDF attributes. pyhdf uses the following rule: an attribute whose name
starts with an underscore ('_') is either a "predefined" HDF attribute
(see below) or a standard python attribute. Otherwise, the attribute
is handled as an HDF attribute. Also, HDF attributes are not stored inside
the object dictionary: the python dir() function will not list them.
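The underscore rule can be illustrated with a small class routing
attribute access. This only sketches the general technique and is not the
pyhdf implementation: AttrRouter is a hypothetical class, a plain
dictionary stands in for the attributes stored in the HDF file, and
predefined attributes are not handled.

```python
class AttrRouter(object):
    """Route names starting with '_' to normal python attributes, and
    all other names to a separate store standing in for the HDF file."""

    def __init__(self):
        # '_store' starts with an underscore: it is a python attribute.
        self._store = {}

    def __setattr__(self, name, value):
        if name.startswith('_'):
            object.__setattr__(self, name, value)   # python attribute
        else:
            self._store[name] = value               # "HDF" attribute

    def __getattr__(self, name):
        # Called only when the normal attribute lookup has failed.
        if name.startswith('_'):
            raise AttributeError(name)
        try:
            return self._store[name]
        except KeyError:
            raise AttributeError(name)

obj = AttrRouter()
obj.version = '1.0'     # goes to the store
obj._cache = 3          # stays in the object dictionary
listed = dir(obj)
```

Since 'version' never reaches the object dictionary, it is absent from the
names returned by dir(), whereas '_cache' is listed.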
Attribute values can be updated, but it is illegal to try to change the
value type, or the attribute order (number of values). This is important
for attributes holding string values. An attribute initialized with an
'n' character string is simply a character attribute of order 'n' (i.e. a
character array of length 'n'). If 'vg' is a vgroup and we initialize its
'a1' attribute as 'vg.a1 = "abcdef"', then a subsequent update attempt
like 'vg.a1 = "12"' will fail, because we then try to change the order
of the attribute (from 6 to 2). It is mandatory to keep the length of string
attributes constant.
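One way to respect this constraint is to pad or truncate the new string to
the original length before assigning it. The helper below is a
hypothetical sketch. Padding with blanks is an assumption made for the
example, and whether trailing blanks or truncation are acceptable depends
on the application.

```python
def fit_to_order(text, order, pad=' '):
    """Pad or truncate 'text' so that it holds exactly 'order' chars."""
    return text[:order].ljust(order, pad)

# An attribute first set to "abcdef" has order 6.
shorter = fit_to_order('12', 6)
longer = fit_to_order('abcdefgh', 6)
```

Assuming 'vg.a1' was initialized as "abcdef", an update written as
'vg.a1 = fit_to_order("12", 6)' keeps the order of the attribute at 6.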
Predefined attributes
---------------------
The VG class supports predefined attributes to get (and occasionally set)
attribute values easily using the usual python "dot notation", without
having to call a class method. The names of predefined attributes all start
with an underscore ('_').
In the following table, the RW column holds an X if the attribute
is read/write.

VG predefined attributes

  name        RW  description                   C library routine
  ---------------------------------------------------------------
  _class      X   class name                    Vgetclass/Vsetclass
  _name       X   vgroup name                   Vgetname/Vsetname
  _nattrs         number of vgroup attributes   Vnattrs
  _nmembers       number of vgroup members      Vntagrefs
  _refnum         vgroup reference number       VQueryref
  _tag            vgroup tag                    VQuerytag
  _version        vgroup version number         Vgetversion

Programming models
------------------
Creating and initializing a vgroup
----------------------------------
The following program shows how to create and initialize a vgroup inside
an HDF file. It can serve as a model for any program wanting to create
a vgroup.
from pyhdf.HDF import *
from pyhdf.V import *
from pyhdf.VS import *
from pyhdf.SD import *
def vdatacreate(vs, name):

    # Create vdata and define its structure
    vd = vs.create(name,
                   (('partid', HC.CHAR8, 5),        # 5 char string
                    ('description', HC.CHAR8, 10),  # 10 char string field
                    ('qty', HC.INT16, 1),           # 1 16 bit int field
                    ('wght', HC.FLOAT32, 1),        # 1 32 bit float
                    ('price', HC.FLOAT32, 1)        # 1 32 bit float
                   ))
    # Store records
    vd.write((('Q1234', 'bolt', 12, 0.01, 0.05),    # record 1
              ('B5432', 'brush', 10, 0.4, 4.25),    # record 2
              ('S7613', 'scissor', 2, 0.2, 3.75)    # record 3
             ))
    # "close" vdata
    vd.detach()

def sdscreate(sd, name):

    # Create a simple 3x3 float array.
    sds = sd.create(name, SDC.FLOAT32, (3,3))
    # Initialize array
    sds[:] = ((0,1,2),(3,4,5),(6,7,8))
    # "close" dataset.
    sds.endaccess()

# Create HDF file
filename = 'inventory.hdf'
hdf = HDF(filename, HC.WRITE|HC.CREATE)
# Initialize the SD, V and VS interfaces on the file.
sd = SD(filename, SDC.WRITE) # SD interface
vs = hdf.vstart() # vdata interface
v = hdf.vgstart() # vgroup interface
# Create vdata named 'INVENTORY'.
vdatacreate(vs, 'INVENTORY')
# Create dataset named "ARR_3x3"
sdscreate(sd, 'ARR_3x3')
# Attach the vdata and the dataset.
vd = vs.attach('INVENTORY')
sds = sd.select('ARR_3x3')
# Create vgroup named 'TOTAL'.
vg = v.create('TOTAL')
# Add vdata to the vgroup
vg.insert(vd)
# We could also have written this:
# vg.add(vd._tag, vd._refnum)
# or this:
# vg.add(HC.DFTAG_VH, vd._refnum)
# Add dataset to the vgroup
vg.add(HC.DFTAG_NDG, sds.ref())
# Close vgroup, vdata and dataset.
vg.detach() # vgroup
vd.detach() # vdata
sds.endaccess() # dataset
# Terminate V, VS and SD interfaces.
v.end() # V interface
vs.end() # VS interface
sd.end() # SD interface
# Close HDF file.
hdf.close()
The program starts by defining two functions vdatacreate() and sdscreate(),
which will serve to create the vdata and dataset objects we need. Those
functions are not essential to the example. They simply help to make the
example self-contained. Refer to the VS and SD module documentation for
additional explanations about how these functions work.
After opening the HDF file in write mode, the SD, V and VS interfaces are
initialized on the file. Next vdatacreate() is called to create a new vdata
named 'INVENTORY' on the VS instance, and sdscreate() to create a new
dataset named 'ARR_3x3' on the SD instance. This is done so that we have a
vdata and a dataset to play with.
The vdata and the dataset are then attached ("opened"). The create()
method of the V instance is then called to create a new vgroup named
'TOTAL'. The vgroup is then populated by calling its insert() method to add
the vdata 'INVENTORY', and its add() method to add the 'ARR_3x3' dataset.
Note that insert() is just a convenience method that simplifies adding a
vdata or a vgroup to a vgroup, avoiding the need to pass an object tag and
reference number. There is no such convenience method for adding a dataset
to a vgroup. The dataset must be added by specifying its tag and reference
number. Note that the tags to be used are defined inside the HDF module as
constants of the HC class: DFTAG_NDG for a dataset, DFTAG_VG for a vgroup,
DFTAG_VH for a vdata.
The program ends by detaching ("closing") the HDF objects created above,
terminating the three interfaces initialized, and closing the HDF file.
Reading a vgroup
----------------
The following program shows the contents of the vgroups contained inside
any HDF file.
from pyhdf.HDF import *
from pyhdf.V import *
from pyhdf.VS import *
from pyhdf.SD import *
import sys
def describevg(refnum):

    # Describe the vgroup with the given refnum.
    # Open vgroup in read mode.
    vg = v.attach(refnum)
    print "----------------"
    print "name:", vg._name, "class:",vg._class, "tag,ref:",
    print vg._tag, vg._refnum

    # Show the number of members of each main object type.
    print "members: ", vg._nmembers,
    print "datasets:", vg.nrefs(HC.DFTAG_NDG),
    print "vdatas: ", vg.nrefs(HC.DFTAG_VH),
    print "vgroups: ", vg.nrefs(HC.DFTAG_VG)

    # Read the contents of the vgroup.
    members = vg.tagrefs()

    # Display info about each member.
    index = -1
    for tag, ref in members:
        index += 1
        print "member index", index
        # Vdata tag
        if tag == HC.DFTAG_VH:
            vd = vs.attach(ref)
            nrecs, intmode, fields, size, name = vd.inquire()
            print " vdata:",name, "tag,ref:",tag, ref
            print " fields:",fields
            print " nrecs:",nrecs
            vd.detach()

        # SDS tag
        elif tag == HC.DFTAG_NDG:
            sds = sd.select(sd.reftoindex(ref))
            name, rank, dims, type, nattrs = sds.info()
            print " dataset:",name, "tag,ref:", tag, ref
            print " dims:",dims
            print " type:",type
            sds.endaccess()

        # Vgroup tag
        elif tag == HC.DFTAG_VG:
            vg0 = v.attach(ref)
            print " vgroup:", vg0._name, "tag,ref:", tag, ref
            vg0.detach()

        # Unhandled tag
        else:
            print "unhandled tag,ref",tag,ref

    # Close vgroup
    vg.detach()

# Open HDF file in readonly mode.
filename = sys.argv[1]
hdf = HDF(filename)
# Initialize the SD, V and VS interfaces on the file.
sd = SD(filename)
vs = hdf.vstart()
v = hdf.vgstart()
# Scan all vgroups in the file.
ref = -1
while 1:
    try:
        ref = v.getid(ref)
    except HDF4Error,msg:    # no more vgroup
        break
    describevg(ref)
# Terminate V, VS and SD interfaces.
v.end()
vs.end()
sd.end()
# Close HDF file.
hdf.close()
The program starts by defining function describevg(), which is passed the
reference number of the vgroup to display. The function assumes that the
SD, VS and V interfaces have been previously initialized.
The function starts by attaching ("opening") the vgroup, and displaying
its name, class, tag and reference number. The number of members of the
three most important object types is then displayed, by calling the nrefs()
method with the predefined tags found inside the HDF.HC class.
The tagrefs() method is then called to get a list of all the vgroup members,
each member being identified by its tag and reference number. A 'for'
statement is entered to loop over each element of this list. The tag is
tested against the known values defined in the HDF.HC class: the outcome of
this test indicates how to process the member object.
A DFTAG_VH tag indicates we deal with a vdata. The vdata is attached, its
inquire() method called to display info about it, and the vdata is detached.
In the case of a DFTAG_NDG, we are facing a dataset. The dataset is
selected, info is obtained by calling the dataset info() method, and the
dataset is released. A DFTAG_VG indicates that the member is a vgroup. We
attach it, print its name, tag and reference number, then detach the
member vgroup. A warning is finally displayed if we hit upon a member of
an unknown type.
The function releases the vgroup just displayed and returns.
The main program starts by opening in readonly mode the HDF file passed
as argument on the command line. The SD, VS and V interfaces are
initialized, and the corresponding class instances are stored inside 'sd',
'vs' and 'v' global variables, respectively, for the use of the
describevg() function.
A while loop is then entered to access each vgroup in the file. A reference
number of -1 is passed on the first call to getid() to obtain the reference
number of the first vgroup. getid() returns a new reference number on each
subsequent call, and raises an exception when the last vgroup has been
retrieved. This exception is caught to break out of the loop, otherwise
describevg() is called to display the vgroup we have on hand.
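The same scan can be packaged as a generator, so that the calling code
reduces to a plain 'for' loop. This is a sketch under assumptions:
iter_vgroup_refs() is a hypothetical helper which is not part of pyhdf,
and HDF4Error and FakeV are stand-ins letting the sketch run without an
HDF file. With a real file, 'v' would be the V instance returned by
vgstart().

```python
class HDF4Error(Exception):
    """Stand-in for the pyhdf exception, so that the sketch runs alone."""

def iter_vgroup_refs(v):
    """Yield the reference number of each vgroup known to 'v', using
    the getid() protocol: start at -1, stop on HDF4Error."""
    ref = -1
    while True:
        try:
            ref = v.getid(ref)
        except HDF4Error:       # no more vgroup
            return
        yield ref

class FakeV(object):
    """Stand-in for a V instance on a file holding three vgroups."""
    refs = [3, 4, 9]

    def getid(self, ref):
        later = [r for r in self.refs if r > ref]
        if not later:
            raise HDF4Error("getid: execution error")
        return later[0]

all_refs = list(iter_vgroup_refs(FakeV()))
```

The scanning loop then becomes 'for ref in iter_vgroup_refs(v):
describevg(ref)'.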
Once the loop is over, the interfaces initialized before are terminated,
and the HDF file is closed.
You will notice that this program will display vgroups other than those
you have explicitly created. Those supplementary vgroups are created
by the HDF library for its own internal needs.
Data
----
__all__ = ['V', 'VG', 'VGAttr']