Christof Monz has generously provided this machine translation of the U Twente ASR output.

In the development data, 4 files did not have speech:
  bg_4413
  bg_8900
  bg_8938
  bg_8947

In the test data, another 4 files have no speech:
  bg_10241.asr.xml
  bg_16343.asr.xml
  bg_26788.asr.xml
  bg_36690.asr.xml

In terms of segmentation, everything under a speech label is taken as one sequence/segment. This segmentation is preserved in the XML MT output file. Pauses were not considered. Word duplicates in sequence have also been removed. The MT did not use the ASR lattice information.
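
The removal of word duplicates in sequence mentioned above can be illustrated with a minimal sketch. This is not the script that was actually used: the function name collapse_repeats is hypothetical, and exact, case-sensitive matching on whitespace-separated tokens is an assumption, since the note does not say how duplicates were compared.

```python
from itertools import groupby


def collapse_repeats(words):
    """Collapse each run of identical consecutive words into one word.

    Assumption: words are compared by exact string equality.
    Non-adjacent repeats are left untouched.
    """
    # groupby yields one (key, group) pair per run of equal adjacent items
    return [word for word, _ in groupby(words)]


# Illustrative segment, not taken from the actual data
segment = "the the cat sat sat sat on the mat".split()
cleaned = collapse_repeats(segment)
print(" ".join(cleaned))
```

Only adjacent repeats are collapsed, so a word that legitimately occurs again later in the segment is kept.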