Christof Monz has generously provided this machine translation of
the U Twente ASR output.

In the development data, 4 files contain no speech:

    bg_4413
    bg_8900
    bg_8938
    bg_8947

In the test data, another 4 files have no speech:

    bg_10241.asr.xml
    bg_16343.asr.xml
    bg_26788.asr.xml
    bg_36690.asr.xml

For segmentation, everything under a speech label is treated as a
single segment. This segmentation is preserved in the XML MT output
file. Pauses were not taken into account. Consecutive duplicate words
have been removed. The MT system did not use the ASR lattice
information.
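
As an illustration only, the removal of consecutive duplicate words
might look like the sketch below. This is not the script that was
actually used. The function name collapse_repeats is hypothetical, and
the sketch assumes that words are compared as exact strings (case
sensitive) and that a segment is already available as a list of tokens.

```python
from itertools import groupby


def collapse_repeats(words):
    """Keep one copy of each run of identical adjacent words.

    Assumption: words are compared as exact strings, so "The the"
    is not collapsed. The original processing may have differed.
    """
    # groupby bundles adjacent equal items; we keep only each run's key.
    return [word for word, _ in groupby(words)]


# Hypothetical segment, not taken from the data.
segment = "the the cat sat sat sat on the mat".split()
print(" ".join(collapse_repeats(segment)))
```

Only adjacent repeats are collapsed, so a word that recurs later in the
same segment (the second "the" in the example) is kept.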