Innovation in numerical analysis

Innovation does not stop in the field of numerical analysis, which lies at the heart of industrial applications of numerical simulation and modeling.

In this regards the recent paper “Multifrontal Factorization of Sparse SPD Matrices on GPUs (ACM)” by T George, V Saxena, A Gupta, A Singh and A R Choudhury in the IPDPS ’11 Proceedings of the 2011 IEEE International Parallel & Distributed Processing Symposium is very interesting. The authors report a seven-fold speed-up for their coupled single-core CPU+GPU hybrid implementation w.r.t. a state-of-the-art single-core CPU implementation. Speedups were the even higher (10 to 25 w.r.t. single-core CPU) with 2-cores CPU + 2 GPU. The proposed methods apply only to symmetric positive definite (SPD) sparse matrices, such as those found in FEM methods and the like; to apply these techniques to the solution of the large-scale systems of non-linear equations and differential-algebraic equations that occur in steady-state and dynamical process simulation would require adapting them to non-symmetric multifrontal solvers.

Unfortunately to successfully pursue innovation in these areas requires a challenging mix of competences in numerical analysis, algorithmic complexity theory, computer science and processor architecture. The innovations are likely to be applied first for the benefit of users that are served by vertical suppliers, which integrate hardware + software + numerics + domain specific competence. Other businesses such as the process industries, which are served by a more complex network of vendors (process automation suppliers, simulator vendors, research centers specialized in numerics …), will see these innovations with a larger delay.

Take the “parallel-computing-for-the-masses” revolution we are living now: gaming consoles, netbooks, tablets and even smartphones have multi-core CPUs; and consumer software is quickly adapting to this switch: games, office software, compression utilities, multimedia applications…

But if we try to search for the keyword “multicore” in the websites of four mayor vendors in the arena of computing applied to the process industries, we wont’ find any match, no even to vague roadmaps, vaporware announcements or bold marketing presentations:




BTW right now I cant’ remember any academic contribution in this area, except one which modesty prevents me to cite here.

Even if the first non-embedded dual-core processor was released in 2001 and the first OpenMP Fortran API was released in 1997, this is the progress for the transition from single-core to multi-core CPUs in process simulation: 0% as of today.

The de-facto industry standard parallel computing architecture for general-purpose computing on graphics processing unit NIVIDIA CUDA was released with complete single-precision floating point support with version 2 around 2008. Guess when exploiting multi-core CPUs and GPUs at the same time will be a must in the  control rooms and on process engineers desks ?

Amazing: there is a lot of work to do !

pixelstats trackingpixel
Posted in Chemeng, Philosophy, Rants, Uncategorized | Leave a comment

Units of measurements switching coming to LIBPF 1.0 User Interface

After the multi-dimensional sensitivity, it was just about time for the User Interface for Process Flowsheeting to get the long-awaited usability boost: support for different units of measurements.

As of pre-alpha version 1.0.748 available on gitorious there is now a global option in the Settings menu, which is persistent on closing / reopening the program:

Also each field has a local option which is volatile (will be reset to the global option when the program is closed / reopened):

Finally the sensitivity analysis UI has been updated with units:

Of course the data storage to the database and all internal operations are in SI units.

pixelstats trackingpixel
Posted in UI | Leave a comment

Linux kvm guest freeze under Debian

If you run uniprocessor Linux guests using kvm virtualization under Linux Debian as host, you may hit the issue with kvm-clock clocksource causing freezes described here and here.

The issue is difficult to track down; search engines do not help with the most obvious query “kvm linux guest freeze”; the actual bugs cited above are filed with redhat bugzilla and not on any of the three separate bug trackers referenced by the kvm project (the kernel’s bug trackerQemu’s launchpad tracker and the obsolete sourceforge bug tracker). To complicate things further, there is a seemingly unrelated issue with SMP guests.

For all these reasons, I have put together here a thorough description of the symptoms and the cure.

The issue: the guests will freeze after some time (1 h – 48 h)  from boot, even if there is nothing going on in any of them. Launching more than one guest in a short sequence is more likely to make them all hang within 1 to 60 min. You can see the kvm process corresponding to the frozen guest takes 100% CPU on the host; they will not respond to ssh, ACPI shutdown/restart, the only way to get rid of them is to kill the process in the host. Even the libvirt daemon does not respond and will need to be force restarted, for example with this script:

service libvirt-bin stop
`ps aux | grep libvirt | grep -v kvm | grep -v grep | awk '{ print "kill -9 " $2; }'`
service libvirt-bin start

Host: The problem occurred on a single-CPU server with an Intel Xeon 2-core CPU and on a similar machine with a 4-core Intel Xeon, both with hardware virtualization support (VT-x) enabled.

Host OS: Linux Debian 6.0 Squeeze AMD64 with 2.6.32-5-amd64 kernel.

Host kvm version: The kvm package from Debian 6.0 Squeeze (version 0.12.5).

Guest OS: Linux Debian 5 Lenny to Debian 7 Wheezy, both x86 and AMD64. Windows guests do not suffer from this problem.

Qemu command line used to start the guests, something like:

/usr/bin/kvm -S -M pc-0.12 -cpu qemu32 -enable-kvm -m 1024 -smp 1,sockets=1,cores=1,threads=1 -name deb6_32_dev -uuid ... -nodefaults -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/deb6_32_dev.monitor,server,nowait -mon chardev=monitor,mode=readline -rtc base=utc -boot c -drive file=/dev/...,if=none,id=drive-virtio-disk0,boot=on,format=raw -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -device virtio-net-pci,vlan=0,id=net0,mac=...,bus=pci.0,addr=0x3 -net tap,fd=48,vlan=0,name=hostnet0 -chardev pty,id=serial0 -device isa-serial,chardev=serial0 -usb -device usb-tablet,id=input0 -vnc 127.0.0.1:17 -k it -vga vmware -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

Troubleshooting tricks you could try (with no success): The problem does not go away if kvm is started with the -no-kvm-irqchip switch and it is also there if kvm is started with the -no-kvm switch (freezes only take longer to occur). There is nothing relevant in the libvirt logs, nothing on the guest console, nothing in the guest dmesg, no particular process active when the freeze occur. The strace output for the kvm processes always shows something like this as last system call:

futex(0x858560, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>

The solution: Set the guest clock source to acpi_pm rather than kvm-clock. This is best accomplished at boot via kernel parameters passed by grub. With grub2 (the default for Debian 6 Squeeze and 7 Wheezy), modify the /etc/default/grub file replacing this line:

GRUB_CMDLINE_LINUX=""

with:

GRUB_CMDLINE_LINUX="clocksource=acpi_pm"

Then run update-grub to update the actual configuration file /boot/grub/grub.cfg and reboot.

With legacy grub (the default with Debian 5 Lenny), edit the /boot/grub/, changing the following line in the Default Options section:

# kopt=root=/dev/hda1 ro

to:

# kopt=root=/dev/hda1 ro clocksource=acpi_pm

but do this without uncommenting it ! Then run update-grub and reboot.

After doing these changes, verify that the correct clock source is active in each guest, by issuing:

cat /sys/devices/system/clocksource/clocksource0/current_clocksource

you should get:

acpi_pm

Good luck !

pixelstats trackingpixel
Posted in Howtos | Tagged | 3 Comments

Heat transfer in falling-film boiling

The correlation of Kunz and Yerazunis (H. R. Kunz and S. Yerazunis, An Analysis of Film Condensation, Film Evaporation, and Single-Phase Heat Transfer for Liquid Prandtl Numbers From 0.001 to 10000, J. Heat Transfer / Volume 91 / Issue 3, 413-421, doi:10.1115/1.3580203 (?)), also cited in Perry, Chemical Engineers Handbook 7th ed. page 11-16 can be used to predict the heat transfer coefficient for falling-film evaporators.

The correlation is presented in the form of a graph, which makes its application in a computer system unpractical:

This graph can be digitized using the excellent utility ScanIt. The resulting digitized plot can be fitted using gnuplot to this expression:

a(x) = ((aa*x + ab)*x + ac)
b(x) = ((ba*x + bb)*x + bc)
c(x) = ((ca*x + cb)*x + cc)
d(x) = ((da*x + db)*x + dc)
e(x) = (ea*x + eb)
f(x) = (fa*x + fb)
g(x,y) = (a(y)*x+b(y))*(0.5-atan((x-e(y))*f(y))/3.14159265)+(c(y)*x+d(y))*(atan((x-e(y))*f(y))/3.14159265+0.5)

The expression g(x,y) returns the base-10 logarithm of the graph abscissa h / (K^3*rho^2*g/mu^2)^(1/3) as a function of:

  • 1 < x = log10(Re) < 5
  • 0 < y=log10(Pr) < 3.

The numerical values for the coefficients are:

aa -0.112754
ab 0.047577
ac -0.613718
ba 0.197783
bb -0.0809519
bc 0.614201
ca -0.00307804
cb 0.0140499
cc 0.290747
da 0.0255274
db 0.127306
dc -1.0811
ea -1.15207
eb 2.44218
fa 0.286347
fb 0.439599

The expression is smooth and qualitatively reproduces the original plot:

The deviation in term of graph abscissa are on average 1.2%, and less than 1% for more than half of the sampled points (370).

pixelstats trackingpixel
Posted in Chemeng, Howtos | Leave a comment

Project the map of the world and export it to a vector format

If you ever need to project the map of the world‘s political borders and export it to the vector format SVG (Scalable Vector Graphics), here is how to do it.

  1. look up your desired map projection, for example here (you will need the Proj4 string)
  2. get the World Borders Dataset from here; this is in the ESRI shapefile format (SHP) and in the WGS 1984 projection
  3. install prerequisites (as root):
    apt-get install proj
    cpan Geo::ShapeFile
    cpan SVG
    cpan Geo::Point
    cpan Geo::Proj4
  4. download shptosvg.pl (a command-line utility written in Perl to render projected maps in SVG format from GIS data in one or more shapefiles)
  5. execute shptosvg.pl (this example converts to the Gall-Peters projection):
    ./shptosvg.pl -x 12211 -y 8000 -S"+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs " -T"+proj=cea +lon_0=0 +lat_ts=45 +x_0=0 +y_0=0 +ellps=WGS84 +units=m +no_defs" -p 8 TM_WORLD_BORDERS-0.2/TM_WORLD_BORDERS-0.2.shp > test.svg
  6. Edit the resulting SVG with Inkscape to fit your needs; example result:
pixelstats trackingpixel
Posted in Howtos | Leave a comment

A robust expression for the Log Mean Temperature Difference (LMTD)

The Log Mean Temperature Difference (LMTD) is a deceivingly simple expression to compute the average driving force in heat transfer:

LMTD = (ΔT1 - ΔT2) / log(ΔT1 / ΔT2);

The ΔT1 and ΔT2 are the temperature approaches, as in this figure:

Everything works well when the approaches are both non-zero, different in value and have the same sign, as in these two figures:

But when this expression is applied in practice, it may become singular whenever:

  1. The denominator equals zero, because the two temperature approaches are the same (this is a purely accidental singularity, because the mathematical limit of the above expression exists and is equal to the temperature approaches)
  2. The argument of the logarithm function is zero, because ΔT1 = 0
  3. The argument of the logarithm function is infinity, because ΔT2 = 0
  4. The argument of the logarithm function is negative, because the sign of ΔT1 and of ΔT2 are different, as is the next figure:

The fact that the LMTD expression screws up if any one of the temperature approaches is equal to zero is reasonable, because an infinite heat transfer area would be required. The fact that it screws up if the sign of the temperature approaches are different indicates that the LMTD is designed to calculate heat transfer in only one direction.
Singularities are often encountered if some iterative algorithm which encapsulates the heat transfer calculation enters a non-feasible region of the solution search space, and generates unphysical values for the temperature approaches.
In this case the expression above returns #NaN or #Inf and the errors may screw up completely the iterative algorithm, preventing it to ever get back in the feasible region and even less getting to the actual solution.

It would be therefore helpful to reformulate the LMTD expression so that it is more robust and gives reasonable if not correct results even for unfeasible input parameters. The modified LMTD expression should satisfy three requirements:

  1. to return the same result as the conventional expression when the approaches have reasonable values
  2. to not violate the second principle of thermodynamic, whereby no heat can be transferred from a colder body to a hotter body.
  3. to avoid sharp discontinuities that are harmful to numerical algorithms.

The method to stabilize the LMTD expression we have tested within LIBPF is based on a dead band, i.e. a range of small temperature approaches (lower in absolute value than a certain threshold). We assume that if anywhere along the heat exchanger we are getting a local temperature difference lower than the threshold in absolute value, we can assume that heat transfer is negligible in the affected fraction of the area.
So if one of the approaches is getting close to zero, it is clipped to the threshold value; furthermore the effective area is reduced by a factor, such that the part of the area where the local temperature difference is lower than the threshold does not contribute to the transfer; this reduction is computed assuming a linear temperature profile.

The method when applied to the case of sign reversal above is represented in this figure, where the dead band is painted in darker yellow color:The part of the area where the sign is opposite to the dominating heat transfer direction (light yellow region) is not considered at all, i.e. no heat is transferred back. In other words, the direction of the heat transferred is set by the sign of the approach larger in absolute value.

The resulting pseudo code is:

if (abs(ΔT1) <= deadBand) {
  if (abs(ΔT2) <= deadBand) {
    LMTD = (ΔT1 + ΔT2) / 2.0;
  } else {
    ΔT1clip = deadBand*ΔT2/abs(ΔT2); // absolute value is set to deadBand, sign is the same as ΔT2
    LMTD = ((ΔT1clip - ΔT2) / (ΔT1 - ΔT2)) * (ΔT1clip - ΔT2) / log(ΔT1clip / ΔT2);
  }
} else if (abs(ΔT2) <= deadBand) {
  ΔT2clip = deadBand*ΔT1/abs(ΔT1);
  LMTD = ((ΔT1 - ΔT2clip) / (ΔT1 - ΔT2)) * (ΔT1 - ΔT2clip) / log(ΔT1 / ΔT2clip);
} else if ((ΔT1 * ΔT2) <= 0.0) {
  if (abs(ΔT1) <= abs(ΔT2)) {
    ΔT1clip = deadBand*ΔT2/abs(ΔT2);
    LMTD = ((ΔT1clip - ΔT2) / (ΔT1 - ΔT2)) * (ΔT1clip - ΔT2) / log(ΔT1clip / ΔT2);
  } else {
    ΔT2clip = deadBand*ΔT1/abs(ΔT1);
    LMTD = ((ΔT1 - ΔT2clip) / (ΔT1 - ΔT2)) * (ΔT1 - ΔT2clip) / log(ΔT1 / ΔT2clip);
   }
 } else {
   LMTD = (ΔT1 - ΔT2) / log(ΔT1 / ΔT2);
 }
pixelstats trackingpixel
Posted in C++, Chemeng | Leave a comment

xps2pdf, xpstopdf

This article was written some time ago, in the meantime Okular in KDE4 does indeed support the XPS format, but sometimes you get sloppy performances, or you still need to convert files from XPS to PDF for some reasons.

Here are the instructions, updated for Debian Wheezy:

  1. download the “GhostXPS/GhostPDL 9.04 Source for all platforms”:
     wget http://downloads.ghostscript.com/public/ghostpdl-9.04.tar.gz
  2. extract and configure
    tar xf Scaricati/ghostpdl-9.04.tar.gz
    cd ghostpdl-9.04/
    ./configure
  3. tweak a small file
    echo '#define HAVE_SYS_TIME_H' >> xps/obj/gconfig_.h
  4. compile
    make xps
  5. convert your file
    cd xps/obj
    ./gxps -sDEVICE=pdfwrite -sOutputFile=test.pdf -dNOPAUSE test.xps

Enjoy !

pixelstats trackingpixel
Posted in Howtos | Leave a comment

Converting a two-pages-per-sheet PDF to one-page-per-sheet

If you got a PDF scan of a book with two pages per sheet such as this one (BTW this is a manuscript dating from 1449: “Tractato de septe peccati mortali” by Frate Antonino, image copyright Houghton Library, Harvard University, Cambridge, Mass.)

and you wish to convert it to a one-page-per-sheet PDF:

then you can use these steps on Debian Linux:

  1. Query the PDF for the number of pages and the resolution:
    pdfinfo ugly.pdf

    look at the “Pages” output of this command. Now type:

    pdftoppm -gray -l 1 ugly.pdf test

    then inspect the resulting test-001.pgm file with an image editor to find out the resolution; for the pages and the resolution I got 223 and 1650 x 1275 pts respectively, so these numbers will be used in the following – you should of course adapt them to your results.

  2. Create a bash script to process a single page:
    cat > doone.sh
    #!/bin/bash
    page=`printf '%03d' $1`
    pagenew=`printf '%03d_' $1`
    gs -sDEVICE=pdfwrite -dNOPAUSE -dQUIET -dBATCH -dFirstPage=$1 -dLastPage=$1 -sOutputFile="$page.pdf" ugly.pdf
    pdftoppm -gray "$page.pdf" > "$page.pgm"
    convert -crop 825x1275 "$page.pgm" "$pagenew.pgm"
    rm "$page.pgm"
    ^D
    chmod u+x doone.sh

    Note that for the X-resolution option of the convert command, I enter the half (625) of the horizontal resolution above (1250); in this way the pgm will be split in two vertically. The pdftoppm command has a -mono option to produce monochrome images, and a -r option to set the resolution.

  3. Run the bash script on all pages:
    seq 223 | xargs -n1 ./doone.sh
  4. Finally concatenate the pages to get hold of the converted PDF:
    convert *_.pgm nice.pdf

    or do that in two steps:

    for i in *.pgm; do convert -compress fax $i `basename $i .pbm`.pdf; done
    gs -q -sPAPERSIZE=a4 -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=nice.pdf *_.pdf
pixelstats trackingpixel
Posted in Howtos | Leave a comment

How to produce a Txy isothermal VLE diagram with LIBPF ?

Instantiate components:

components.addcomp(new purecomps::water); // 0
components.addcomp(new purecomps::EthyleneGlycol("MEG")); // 1
components.addcomp(new purecomps::DiethyleneGlycol("DEG")); // 2
components.addcomp(new purecomps::TriethyleneGlycol("TEG")); // 3

Instantiate and setup the stream object:

StreamNrtl2LiquidVapor s(Options(-1));
s.S("flowoption")->set("Nw");
s.S("flashoption")->set("PA");
s.Q("P")->set(101325.0, "Pa");
s.Q("Tphase.w", "H2O")->set(0.0);
s.Q("Tphase.w", "MEG")->set(0.0);
s.Q("Tphase.w", "DEG")->set(0.0);
s.Q("Tphase.w", "TEG")->set(0.0);

Loop over a composition range (in this case mass-based) and calculate dew and bubble point temperatures

for (Quantity x=Zero; x<=One; x += 0.01) {
  *s.Q("Tphase.w", "MEG") = x;
  *s.Q("Tphase.w", "DEG") = One - x;
  s.Q("Vphase.fraction")->set(Zero);
  s.calculate();
  Quantity TBP = *s.Q("T");
  s.Q("Vphase.fraction")->set(One);
  s.calculate();
  Quantity TDP = *s.Q("T");
  std::cout << x << " " << TBP.val() << " " << TDP.val() << std::endl;
}

What you get is a table of values; use your preferred plotting tool to create a plot like this one:

 

pixelstats trackingpixel
Posted in C++, Chemeng | Leave a comment

LIBPF builds with clang++

Clang++ is a new generation C++ compiler, based on a complete redesign. According to the reports it can not yet match compilation and execution time of the more mature competitors (g++, msvc, intel…), but it is growing up really quickly: the successful compilation of the boost libraries has been announced about one year ago. It was definitely time to give it a try with the LIBPF library, which we did on Debian testing (sid).

Compiling LIBPF on Debian sid, by the way, is much easier than what it used to be because over the years all the dependencies have been packaged (previously there was no libloki-dev and libsilo-dev package available). What is required nowadays to compile the LIBPF library with the default compiler on Linux (g++) is just:

apt-get install git g++ libboost1.46-dev xsltproc libloki-dev libpq-dev libsqlite3-dev libgmm++-dev libiodbc2-dev libsundials-serial-dev libmxml-dev libsilo-dev libsuitesparse-dev
git clone ...
cd LIBPF/src
bjam

To compile with clang version 2.9-7 you need to issue the following two commands:

apt-get install clang
bjam toolset=clang

Treating your (supposedly) standard-complying C++ code with a new compiler always brings in some new insight. Clang++ benefits here are the precise error / warning messages and the stricter syntax parser. The main lesson learned thanks to clang was that the right way to define a static member and instantiate a template class is:

template<> std::string PhaseIdeal<PhaseType::vapor>::type_("PhaseIdeal<vapor>");
template<> std::string PhaseIdeal<PhaseType::liquid>::type_("PhaseIdeal<liquid>");
template<> std::string PhaseIdeal<PhaseType::solid>::type_("PhaseIdeal<solid>");
template class PhaseIdeal<PhaseType::vapor>;
template class PhaseIdeal<PhaseType::liquid>;
template class PhaseIdeal<PhaseType::solid>;

and not the other way around (which is tolerated by all other compilers):

template class PhaseIdeal<PhaseType::vapor>;
template class PhaseIdeal<PhaseType::liquid>;
template class PhaseIdeal<PhaseType::solid>;
template<> std::string PhaseIdeal<PhaseType::vapor>::type_("PhaseIdeal<vapor>");
template<> std::string PhaseIdeal<PhaseType::liquid>::type_("PhaseIdeal<liquid>");
template<> std::string PhaseIdeal<PhaseType::solid>::type_("PhaseIdeal<solid>");

And now some performance data, obtained on a somewhat outdated, single-core workstation:

Performance criterion g++ 4.6.1 clang++ 2.9-7
debug release debug release
Compilation time 6m27.550s 10m26.199s 7m26.881s 10m56.789s
Executable size (Mbytes) 24.1 7.6 42.1 10.1
Execution time - 10m10.655s - 11m30.840s

Clang++ has still some way to go before entering the A league, but it is already today a valuable code compliance tool.

pixelstats trackingpixel
Posted in C++ | Leave a comment