Older blog entries for rudybrian (starting at number 26)

Last Saturday's Phase IV test went pretty much as planned. We didn't have any collisions or close calls during the run. During the test another demo activity converged in the 'atrium area' on the lower level of the museum blocking Zaza's only route to her next goal. The large number of people and the unmapped obstacles currently in the area from the installation of the 'Play' exhibit in the temporary exhibit space prevented Zaza's localizer from getting a revised position for an extended period of time. She began searching for an obstacle that looked familiar but the visitors were so engaged in the demo that they would not let her pass. The 'reaction' module was running, and the robot began verbalizing her dislike of being blocked to the demo audience, to the dismay of the presenter ;) To keep Zaza from further interfering with the demo, we manually joysticked her out of the area. The remainder of the test was fairly uneventful. I was finally able to get the high-level people detection code working about half-way through the run, and we used it for the remainder of the test. If it can do a better job of detecting people, I'll write a new version of reaction to support it for Phase III and IV operation modes.

I made a few architectural improvements to the voice/face system over the last few days. The voiceServer now maintains a 'stack' of the last n cues in shared memory to give slower asynchronous clients a loss-free way to get speech cue data. This should eliminate the possibility of losing cues that are sent too quickly to be spoken in real time. The change required updating the clients and applet, but it was worth it.
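The cue 'stack' idea can be illustrated with a small bounded history. This is only a sketch in Python, not the voiceServer code: the class name, the sequence numbers, and the in-process deque standing in for shared memory are all assumptions made for the example.

```python
from collections import deque


class CueHistory:
    """Keep the last `size` cues, each tagged with a rising sequence number."""

    def __init__(self, size):
        self.cues = deque(maxlen=size)  # oldest entries fall off automatically
        self.next_seq = 1

    def push(self, text):
        """Record a new cue and return its sequence number."""
        seq = self.next_seq
        self.cues.append((seq, text))
        self.next_seq += 1
        return seq

    def since(self, last_seen):
        """Return every retained cue newer than `last_seen`, oldest first."""
        return [(seq, text) for seq, text in self.cues if seq > last_seen]


history = CueHistory(size=4)
for line in ["Hello", "Excuse me", "Please let me pass"]:
    history.push(line)

# A slow client that last saw cue 1 catches up on both cues it missed.
missed = history.since(1)
```

A client in this scheme only loses data if more than n cues arrive between two of its polls, so n would need to be sized for the slowest client.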

Just for grins, I ran the code I have written since acquiring the robot through SLOCCount. Quite amusing, this obsession is ;)

The Tech Museum's web folks now have the 'official' Zaza website online.

Huzzah!

I finally had a chance to update the Zaza Info and Technical sections on my site this week. Quite a bit has happened in the last five months!

I'm looking forward to tomorrow's Phase IV test and hoping everything goes smoothly ;)

Made a fair amount of progress today.

I solved a fairly serious problem where Zaza would occasionally crash into obstacles after visitors 'herded' her into them. It turns out we weren't using a critical component of the high-level collision avoidance code, and hadn't noticed until now. Oops ;)

I fixed a bug in the voiceClient Perl code that caused the Java face applet to get 'shy' and stop talking when it received a bad speech request from a client.

Last week I submitted a few pages I have been working on for the Tech's website about Zaza. Hopefully the Tech's webfolks will have it up in a week or so.

The FFmpeg team is nearly ready to release another official version that will fix some of the problems with live streaming. It's been nearly nine months since the last release, so it's been a long time in coming. This will allow me to replace the JPEG-based 'webcam' scripts and applet/ActiveX control with a true video streaming system using MPEG4/H263 compression and MP3 audio.

23 Apr 2002 (updated 23 Apr 2002 at 21:15 UTC) »

The Zaza project is picking up momentum.

After three months of software development work I was finally able to switch Zaza's second onboard computer over to Linux. There were some quirks to getting the machine working properly, though. The SMP kernel/hardware combination had some trouble with two of the network cards we tried, but I eventually found one that worked properly. We couldn't get the composite NTSC video out port on the video card to sync when running under X, so we had to add a dedicated external VGA-to-NTSC converter. Last Saturday we did the first public run with the new face/voice setup made possible by the upgrade. Things went pretty well, all things considered.

We have been doing software tests of the Phase IV code on Saturday afternoons after the regular public demo. Slow progress is being made on the 'tour' behavior code, but we should have something usable by mid-summer.

Lots of updates on Zaza's code development since last month:

I think I finally resolved the bug in the zazacam applet that caused it to slowly consume virtual memory. Apparently Java likes to cache images in memory indefinitely if the getImage() function is used. Garbage collection and flush()ing manually don't seem to help.

I fixed a bug accidentally introduced into zazamap applet during the re-write for 0.60 that prevented the graphics canvas from repainting automatically after a position update. The applet still consumes far too much of the CPU in auto-track and higher zoom modes, so I still have some work to do.

As mentioned in the last entry, the zazaface applet now works with the faceServer and performs fairly well. I expected both the scripts and waveforms to be cached on the client side, but it appears that only the waveform is. Performance is still acceptable. I also updated the viseme image handling routine to auto-scale to the graphics canvas. I still need to tweak the sections of the code related to running in application mode, and correct the polling delay so that it is consistent between platforms.

Before yesterday's Phase IV test run I found and fixed one longstanding bug in poslibtcx's goal arrival code that caused unpredictable behavior. For some reason GCC isn't issuing warnings about variables that are declared twice ;)

I started researching possibilities for the new video distribution system that switching zaza2 to Linux enables. MPEG4IP has some great tools for live streaming, but until the licensing fee issue is resolved and MPEG4 clients are standardized, it isn't an option. FFmpeg worked quite well in tests with an earlier version, but FFServer in the current version is broken, so we won't be able to use it either. I guess we are still stuck with MJPEG until a better option becomes available.

20 Feb 2002 (updated 20 Feb 2002 at 22:35 UTC) »

Last week I finished a reference implementation of Zaza's new face/voice server in Perl. I debated using either Perl or Java for the server-side components, as each has aspects that could be of some use. I finally settled on Perl because it would take less time to put together something stable.

The 'faceServer' application is designed to be scalable to the available computing resources both onboard and offboard the robot. It connects to a Festival server by use of the Festival::Client::Async module. Since Festival's rendering of speech can be rather slow, the Festival server can be located on a higher-speed offboard computer. Speech 'cues' are cached on the local file system for better performance with pre-rendered cues. The Java face applet is notified when new cues have been sent to the server by use of cue ID numbers (10-digit CRC32s of the cue string). The client downloads the waveform and 'script' data and begins playback as soon as it has both files. Since the filenames of each cue are unique, they are cached by the applet so no download is required if the cue has already been 'performed' by the applet.
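The cue ID scheme described above can be sketched as follows. This is a Python illustration rather than the Perl faceServer code; the function names and the `.wav`/`.script` file extensions are assumptions, and only the "10-digit CRC32 of the cue string" part comes from the description.

```python
import zlib


def cue_id(cue_text):
    """Return the CRC32 of the cue string as a zero-padded 10-digit decimal."""
    crc = zlib.crc32(cue_text.encode("utf-8")) & 0xFFFFFFFF
    return f"{crc:010d}"


def cue_filenames(cue_text):
    """Derive the waveform and script filenames for a cue (naming is assumed)."""
    base = cue_id(cue_text)
    return base + ".wav", base + ".script"


wav_name, script_name = cue_filenames("Welcome to the Tech Museum")
```

Because the same cue text always yields the same ID, a repeated cue maps to filenames the client already holds, which is what makes the client-side caching work. A 32-bit CRC can collide for different strings, so a larger cue library might want a check for that.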

I still need to spend some time with the applet, but expect to have it operational by Friday afternoon.

As always, the project code can be found here.

The Tech Museum is hosting an 'Engineering WOW Weekend' this Saturday and Sunday. Folks from the HBRC and SFRSA will be there showing off their robotic creations. Zaza will have her first chance to interact with another robot since May of last year. It should be fun ;)

I cleaned up Zaza's goal arrival code a bit on the morning of the 15th and ran for most of the afternoon until her batteries ran out. The new code correctly identifies goal arrival, and initiates a verbal announcement of this.

On the 18th I added a few new 'virtual' security barriers to the planner's map of the lower level of the museum, which should help to prevent collisions with a few 'invisible' obstacles in the Explorations gallery.

Work on the new control interface continues. I completed the dual-machine process monitoring Perl CGI, map applet hooks and CSS frame definitions last Friday. The new startup/shutdown CGI will take some time, but I should be able to complete it this Friday.

Over the last five months I have been searching for a way to replace Zaza's Windows-based face application with something that would run under Linux. Zaza2, the second computer onboard Zaza, has been getting progressively less reliable, and has hamstrung our ability to enhance Zaza's 'personality' by adding voice recognition, face tracking, or other vision applications. I tried Wine and several 'virtual machine' applications to run the application as-is. Unfortunately, either the sound output was horrible, or rendering of the mouth was done too slowly. I ran across a project attempting to synchronize the output from Festival with Ken Perlin's Java face. It doesn't appear that its author was successful, but it gave me a few ideas. On the 16th I began working on a proof-of-concept to see if I could use a Java applet/application to synchronize the output from Festival with the 13 'visemes' used by the MS SAPI SDK's 'talking microphone'. By the evening of the 17th I had a working demo. The applet uses a 'phoneme script' and WAV file produced by Festival to synchronize the audio playback with the appropriate 'viseme' for the phoneme being spoken. The results are encouraging, and prove the viability of this approach. My only lingering concern is performance. The architecture of the new face system will need to address this.
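The synchronization idea can be shown with a small lookup: given a phoneme script with timings, pick the mouth shape for the current playback time. This is a Python sketch, not the applet code. The 'end_time phoneme' line format, the phoneme names, and the phoneme-to-viseme table are all assumptions for illustration; the real script format and the mapping onto the 13 visemes are not spelled out here.

```python
import bisect

# Hypothetical phoneme-to-viseme table; the real mapping to the 13 mouth
# shapes is not given in the entry.
VISEME_FOR = {"pau": 0, "hh": 1, "ax": 2, "l": 3, "ow": 4}
REST_VISEME = 0


def parse_script(lines):
    """Parse 'end_time phoneme' lines into parallel lists (assumed format)."""
    ends, phones = [], []
    for line in lines:
        end, phone = line.split()
        ends.append(float(end))
        phones.append(phone)
    return ends, phones


def viseme_at(ends, phones, t):
    """Return the viseme for playback time t (seconds)."""
    i = bisect.bisect_right(ends, t)  # index of the first segment ending after t
    if i >= len(phones):
        return REST_VISEME  # past the end of the audio
    return VISEME_FOR.get(phones[i], REST_VISEME)


script = ["0.10 pau", "0.18 hh", "0.25 ax", "0.33 l", "0.50 ow"]
ends, phones = parse_script(script)
```

A playback loop would call `viseme_at` with the audio clock on every repaint, so the mouth follows the sound even if individual frames are drawn late.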

I finally had a chance to clean up the map image rendering routines in Zaza's map applet yesterday. Rather than rendering the map directly to the on-screen canvas after each position update, the canvas is double-buffered for a small speed increase. This should help to localize the source of the reported memory leak in one of the applets.

I spent last Friday working on automating the startup of the Phase III/IV applications and servers. Since two machines are involved, and the startup of some of the servers can be quite lengthy, I will be moving the CGI back-end of the control interface to the offboard server. Fortunately this machine has a modern Perl version, so I can clean up the Apache CGI interface.

poslibtcx still has some bugs relating to arrival at goals. Occasionally goal arrival is not registered, and the planner gets out of sync with poslib. I need to update the 'reached goal' routine to correct this, and only allow a single execution at arrival. After these issues have been resolved, I will stub in routines to stop, turn to the goal (or audience) and deliver an informational monolog.
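The 'single execution at arrival' requirement amounts to a one-shot latch. Here is a minimal sketch in Python, assuming a simple distance test against an arrival radius; the class, its method names, and the units are hypothetical and do not reflect the actual poslibtcx interface.

```python
import math


class GoalArrival:
    """Fire the arrival action once per goal, however often it is polled."""

    def __init__(self, goal, radius):
        self.goal = goal
        self.radius = radius
        self.reported = False

    def update(self, position):
        """Return True only on the first update inside the arrival radius."""
        dx = position[0] - self.goal[0]
        dy = position[1] - self.goal[1]
        inside = math.hypot(dx, dy) <= self.radius
        if inside and not self.reported:
            self.reported = True
            return True
        return False

    def set_goal(self, goal):
        """Re-arm the latch when the planner issues a new goal."""
        self.goal = goal
        self.reported = False


tracker = GoalArrival(goal=(10.0, 5.0), radius=0.5)
path = [(0.0, 0.0), (9.8, 5.1), (9.9, 5.0), (10.0, 5.0)]
fired = [tracker.update(p) for p in path]
```

Re-arming only in `set_goal` means repeated position updates near the goal cannot trigger the announcement a second time.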

Thanks to support from TRCY club members, we did the first public web interface test with Zaza on December 22. We ran into a few minor problems with the 'poslibtcx' Perl-C interface layer when there are multiple exhibit destinations in the planner. One of the other club members noticed a memory leak in one or both of the applets during the run. I was able to correct the poslibtcx bug, and the following weekend's test went off without a problem.

The Museum's firewall appears to be having some difficulty, and at the moment external access to the video and web interface is not available, but should be again by Friday afternoon.

After several months of neglect, I finally had a chance to update each section of the main Zaza site last week.

Thanks to the IT folks at the Tech, Zaza's webcam has one less layer of firewall to penetrate, so I have been able to boost the speed a bit. The live video page has been updated to allow users of IE access to a slightly faster stream thanks to an ActiveX control that provides server-push support. For some reason my Perl CGI isn't outputting MJPEGs that Netscape and Mozilla like, so for the moment the speed increase can only be enjoyed by IE users. I am planning to spend some time working on the problem this Friday before the following day's run.
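For reference, server-push MJPEG is usually framed as a `multipart/x-mixed-replace` response with one JPEG per part. The Python sketch below shows that general layout only; it is not the Perl CGI, the boundary string and function names are made up for the example, and it makes no claim about what Netscape and Mozilla object to in the current output.

```python
BOUNDARY = "zazaframe"  # arbitrary boundary string chosen for this sketch


def stream_header():
    """CGI header announcing a server-push stream of replaceable parts."""
    return (
        f"Content-Type: multipart/x-mixed-replace; boundary={BOUNDARY}\r\n\r\n"
    ).encode("ascii")


def frame_part(jpeg_bytes):
    """Wrap one JPEG image as a single part of the multipart stream."""
    head = (
        f"--{BOUNDARY}\r\n"
        "Content-Type: image/jpeg\r\n"
        f"Content-Length: {len(jpeg_bytes)}\r\n\r\n"
    ).encode("ascii")
    return head + jpeg_bytes + b"\r\n"


# Stand-in bytes, not a real image: just the JPEG start and end markers.
fake_jpeg = b"\xff\xd8fake\xff\xd9"
part = frame_part(fake_jpeg)
```

A CGI would write `stream_header()` once, then write and flush one `frame_part(...)` per captured frame for as long as the client stays connected.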

