Table of Contents   Advanced Topics   Deal Database Forum

Handling Audio

Audio Holes

Video streams are far from perfect. Indeed, errors in both the audio and video portions of the signal are quite common.  Audio and video are usually compressed using "lossy" techniques such as MPEG, which deliberately discard some of the original signal data in the interest of packing the stream into the smallest practical amount of space.  Errors of this sort are occasionally a problem, but since they originate in the source, there really isn't much to be done about them.  Occasionally, however, an entire Presentation Unit will be lost or corrupted.  A Presentation Unit (PU) is the MPEG term for one packet of either video or audio, as the case may be, in the packetized stream. In a TiVo this sort of error is most frequently seen with a satellite (DTiVo) source, where a PU or even a group of PUs is lost or corrupted.  Once again, if the hole is in the video stream, it will be very briefly visible on the screen, but there isn't really much to be done about it.  The loss of an audio PU or two is ordinarily completely unnoticeable.

These events aren't common, but they do happen: a single stream won't have hundreds of them, but a really bad stream might have 10 holes. In a single 3-minute clip the author found 4 holes, each consisting of 4 audio PUs. Allowing 24ms for each PU means a loss of 96ms at each hole.  When spaced through the media in this fashion, as said before, they are completely unnoticeable, but if the user is splitting the audio and video streams, the errors add up, and before long it can become impossible to keep the two synchronized.
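To put numbers on that, here is a small Python sketch of the arithmetic. The 24ms PU duration and the 4 holes of 4 PUs each come from the example clip; the function name is purely illustrative.

```python
PU_MS = 24  # playing time allowed for one audio PU, per the example clip

def drift_after_split(holes, pu_ms=PU_MS):
    """Return the cumulative audio shortfall (ms) after each hole.

    `holes` lists how many audio PUs were lost at each hole, in stream order.
    """
    total = 0
    running = []
    for lost_pus in holes:
        total += lost_pus * pu_ms
        running.append(total)
    return running

# The 3-minute clip from the text: 4 holes of 4 PUs each.
print(drift_after_split([4, 4, 4, 4]))  # [96, 192, 288, 384]
```

By the end of that clip the split audio would run 384ms short of the video, even though no single hole was audible on its own.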

Splitting is the process of taking a source stream and outputting 2 separate Elementary Stream (ES) files: one for audio and one for video.  The key idea to remember in this regard is that there is no timing information between the two: the synchronization information is completely lost when splitting.

With no holes a video stream looks like this:

Video:   1   2   3   4   5   6   7   8   9  10
Audio:   1   2   3   4   5   6   7   8   9  10

With audio holes, however, it looks like this:

Video:   1   2   3   4   5   6   7   8   9  10
Audio:   1  --   3   4  --  --   7  --   9  10

(-- marks a lost audio PU.)

If the stream is maintained as a single entity, this presents no problem because the holes are inaudible, and the sync tags telling the decoder when to play each audio frame are still embedded in the stream.  Each audio frame plays at the correct instant with respect to its sister video frame, whether any predecessors have been lost or not.  If the audio and video in such a stream are split into separate streams, the following is the result:

Video:   1   2   3   4   5   6   7   8   9  10
Audio:   1   3   4   7   9  10  11  12  13  14

Clearly any attempt to match up the two independent streams at a later time is doomed to failure.  If the user wishes to split a stream of any great length, edit the independent streams, and patch them back together, the result may be sync problems.  If this happens, the user can attempt to rectify the problem by setting the Patch Audio Holes option.

If Patch Audio Holes is set, TyTool will attempt to replace each lost or damaged audio PU with a block of silence.  This will have the effect of keeping the streams much closer to the same length without impacting the perceived audio quality.
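One way to picture the patching is the Python sketch below. It is not TyTool's actual code: it assumes each audio PU carries a presentation timestamp in milliseconds and plays for a fixed 24ms, and every name in it is hypothetical.

```python
PU_MS = 24            # assumed fixed playing time of one audio PU
SILENCE = "silence"   # stand-in for an encoded frame of silence

def patch_audio_holes(pus, pu_ms=PU_MS, max_fill=1000):
    """Fill timestamp gaps in a list of (pts_ms, payload) audio PUs.

    Whenever the next PU starts later than expected, whole silent PUs are
    inserted to cover the gap. Gaps shorter than one PU are left alone, and
    `max_fill` caps the number of fillers per gap as a sanity check.
    """
    patched = []
    expected = None
    for pts, payload in pus:
        if expected is not None and pts > expected:
            missing = (pts - expected) // pu_ms   # whole PUs only
            for i in range(min(missing, max_fill)):
                patched.append((expected + i * pu_ms, SILENCE))
        patched.append((pts, payload))
        expected = pts + pu_ms
    return patched

# Audio PUs 1, 3, 4, 7, 9 and 10 from the diagram above, 24ms apart.
stream = [(0, "A1"), (48, "A3"), (72, "A4"), (144, "A7"), (192, "A9"), (216, "A10")]
print(len(patch_audio_holes(stream)))  # 10
```

With the four missing PUs replaced by silence, the audio is back to 10 PUs and lines up with the 10 video PUs again.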

So why not just turn the option on and leave it?

Although setting this option has an extremely small impact on processing time, the bad news is that in the past there were circumstances where using it would destroy the output stream. It appeared to happen when, for some odd reason, less than 1 full packet was needed, and the patcher would then generate literally thousands of filler packets.  An attempt was made to fix the issue, but no one ever submitted an actual stream that triggered it, so it is not certain that it is fixed at all.  The bottom line is that the user should leave the option disabled unless a sync problem is encountered after splitting a stream, in which case turning the option on may help.

Transcoding

(Written by Josh Dinerstein & edited by Leslie Rhorer)

1.0 Introduction

OK. Some basics to start off with. TiVos record most things in MPEG audio format. The notable exceptions are some movies and specials that are done in various Dolby audio formats.

YOU CANNOT CHANGE THE BIT-RATE OR SAMPLE-RATE OF AN MPEG AUDIO DATA STREAM.  "But how can that be?" you ask.  After all, the other tools do it... :)

The answer is simple. MPEG-1 Layer II audio, which both the SA and DTiVo units use, is a compressed data stream.  You cannot alter a compressed stream in place.  It would be like trying to change a few bits in a .ZIP file and expecting to get a different and yet correct set of files when you uncompress the archive. 'Just not going to happen. Or in a non-technical way it is like putting a box of one size inside of a larger box and trying to claim it is the same thing.

So how is it done? The process becomes:

- Decompress the stream as it was recorded by the TiVo. This results in the raw PCM (Pulse Code Modulation) data. This is the same thing as a .WAV file on the PC.  Uncompressed, this data is HUGE.

- Change the bit-rate and sample rate, as desired, on the PCM raw data. Now that we have the raw data we can indeed change these 2 pieces of information. I see no point in getting into the nitty gritty detail of how it was done but trust me that it can be done successfully at this data layer.

- Re-encode the PCM data into a brand new stream of MPEG-1 Layer II audio packets. This produces an all-new Elementary Stream for the audio. The size of the packets will be different from before. The playing time of each packet will be different from before. Etc... It is truly all new.
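The middle step is the only one that can be shown without a real MPEG codec, so here is a deliberately naive Python sketch of it. It uses plain linear interpolation, which is an assumption made for illustration and not what the actual tools do; a proper resampler filters the signal to avoid aliasing. The function name is made up.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample mono PCM samples by linear interpolation (illustration only)."""
    if not samples:
        return []
    n_out = len(samples) * dst_rate // src_rate
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate        # position in the source signal
        left = int(pos)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out

pcm_32k = [0, 300, 600, 900]                 # 4 samples at 32,000 Hz
pcm_48k = resample_linear(pcm_32k, 32000, 48000)
print(len(pcm_48k))                          # 6 samples at 48,000 Hz
```

The same stretch of sound now takes 6 samples instead of 4, which is why nothing from the original compressed packets can be reused and the encoder has to build every output packet from scratch.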

1.1 How this has been done in the past...

In the recent past many people have posted methods using other tools to do this. My own involved using WinAmp, VirtualDub, and toolame. 'Worked. 'Worked every time. But it was tedious and made editing, especially in a non-perfect stream, downright impossible.

Others have used BeSweet, which gave a GUI and a one-stop-shopping setup for doing this. It worked. And often sounded great. But again it required that there be no editing etc...

Fundamentally this has rather been beaten to death.  If you have to split the files you lose the data that makes it possible for me to work around the holes and other issues that can and do often occur.

1.2 How this is being done now in TyTool and VSplit...

I saw no point in re-inventing the wheel. We have ssrc which changes things in a great way. We have toolame and others like it for encoding. We have a number of open source decoders. So a long long time ago I started to gather these things and made a very ugly tool for doing this kind of thing. It worked for only 1 test file that I had. That is how targeted it was to what I was testing and doing before.

I was then contacted by another software author on the forum.  He offered to write a newer tool setup for me and he did. I was really slow in getting to it, however, as there were other things I was working on at the time. (Basically I was trying to perfect the splitting and error detection.) He then went on to join the TyStudio team and put up his work as the TyTranscode program in that product set. Trust me I don't blame him at all.  I think it is great that people have been able to use it for so long.

When I decided to add this feature I got in contact with him about using his stuff.   Again, why re-invent the wheel?  After all, Rowan had it all working... :) I just wanted to make sure that after some of the bad blood that had happened here on the forum he was OK with my using it.  He was, and we have collaborated on what is being used now.

So just to be clear, the transcoding features in TyTool and VSplit are in fact the TyTranscode functionality with a few changes.

1.3 What was changed?

Primarily I made it into a DLL. This was done for many reasons:

1- It is large! Larger, in fact, than either VSplit or TyTool, all on its own. I have not looked at the source carefully enough to know why, so I have no clear picture as to whether this size is truly needed or a by-product of other things that are going on.

2- It is my understanding that it was written using open source code. Not a bad thing. But I am heartily sick of the arguments about that whole concept.  So I made it a library.

3- I wanted to make it so that it could be changed out later if needed. A different library with the same interface could be used instead. Kind of a plug-in architecture. At least that is how it started. I am not sure that I made it generic enough. But hey it is a first cut at the idea.

4- I changed it from working on files to taking 1 audio packet in at a time and handing back whatever transcoded packets result from each packet written.  'Just trust me on this one.  Given how I have been building things this was the way to go.

5- I created all of the DLL portions of what is going on.
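To make points 3 and 4 a little more concrete, here is a rough Python sketch of what a packet-in, packets-out plug-in interface could look like. None of these names come from the real DLL, whose exports are not documented here; the toy class only regroups bytes and does no audio work at all.

```python
from typing import List, Protocol

class AudioTranscoder(Protocol):
    """Hypothetical plug-in interface: one packet in, zero or more out."""
    def write_packet(self, packet: bytes) -> List[bytes]: ...

class Regrouper:
    """Toy stand-in for a transcoder: regroups bytes into fixed-size packets.

    A real transcoder would decode, resample and re-encode; this one only
    shows why a written packet may yield zero, one or several output packets.
    """
    def __init__(self, out_size: int) -> None:
        self.out_size = out_size
        self.pending = b""

    def write_packet(self, packet: bytes) -> List[bytes]:
        self.pending += packet
        ready = []
        while len(self.pending) >= self.out_size:
            ready.append(self.pending[:self.out_size])
            self.pending = self.pending[self.out_size:]
        return ready

def run(transcoder: AudioTranscoder, packets: List[bytes]) -> List[bytes]:
    """Feed packets through any object that offers write_packet()."""
    out: List[bytes] = []
    for p in packets:
        out.extend(transcoder.write_packet(p))
    return out

print(run(Regrouper(4), [b"abcdef", b"gh", b"ijkl"]))  # [b'abcd', b'efgh', b'ijkl']
```

Because `run` only depends on `write_packet`, any other library offering the same call could be swapped in, which is the plug-in idea described in point 3.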

Now there are a few really important things for the user to consider here or at least of which to make note:

1- Normally when a Win32 program uses a DLL it is required to be there. I.e. the program will not run if you don't have it. I expect that just about all of you have seen this happen at least once. This occurs when you do a static binding at link time.

I CHOSE NOT TO HANDLE IT THIS WAY!!! I did a manual dynamic binding within the code itself. Thus the DLL is loaded only when a transcoding action is specifically required. So for those that want to go from a DTivo stream right to DVD, like me, you do not need to and in fact should not be transcoding the audio. You will not need the DLL at all and can just make use of the EXE itself.
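In Win32 terms this is the difference between linking against the DLL's import library and calling LoadLibrary and GetProcAddress at run time. The same idea can be sketched in Python with ctypes. The file name `transcode.dll` below is a placeholder, not the real name of the library, and `mux` is a stand-in for the real work.

```python
import ctypes

class LazyLibrary:
    """Load a shared library on first use instead of at program start."""

    def __init__(self, path, loader=ctypes.CDLL):
        self.path = path          # placeholder name; the real DLL may differ
        self._loader = loader
        self._lib = None

    @property
    def loaded(self):
        return self._lib is not None

    def get(self):
        # Only touch the file system when transcoding is actually requested.
        if self._lib is None:
            self._lib = self._loader(self.path)
        return self._lib

transcoder = LazyLibrary("transcode.dll")   # nothing is loaded yet

def mux(packets, transcode=False):
    if transcode:
        transcoder.get()   # raises OSError here if the DLL is missing
    return len(packets)    # stand-in for the real muxing work

print(mux([b"a", b"b"]), transcoder.loaded)  # 2 False
```

A run that never asks for transcoding never looks for the library, so its absence goes unnoticed.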

For those that want to use the transcoding to go from an SA TiVo to DVD, for instance, you will need the DLL. At the start of a VOB mux session you will see it load the DLL and work will start. You will also see that it is slower if you do this. Audio encoding takes time. There is no way around that.

Thus for those that don't want to transcode there is no need to copy the DLL around with the EXE. Just take the EXE and you will have no problems.

This might seem like a bit of overkill. But I have no use for the transcoding and I hate having to shift around things that I do not need just so the program will run. And hey. I wrote it so I get to make decisions like that... It is one of the perks of being the author... :)

2- I will say it again to be clear. Audio decoding and encoding takes time. For those with SA Tivo streams be prepared for it to go slower. Again the impact is not extreme. But it is there. Be patient. I think you will find the output well worth the wait.

3- I architected the big rewrite a while back with this kind of thing in mind. As a result all of the previous features of VSplit and TyTool work correctly. So you can still make key files, edit them, and mux, split-mux, SVCD mux, and VOB mux, all with the cuts, with no problems. And of course, for the coup de grâce, splitting can be done with transcoded audio as well, for use with any other tool people want (again with cutting in place, just like the last release... :) ). The output will simply have a new audio stream.

4- Because it is part of the process, any holes are correctly handled, including the resetting PTS values that are such a freaking pain to deal with. So the thing will be in sync, and stay that way throughout the duration of the show. I have tested this and it works for both DVDs and SVCDs with holes and with resets on my hardware players.

2.0 How to Transcode using VSplit:

'Simple.  There is a new command line argument. Here is the new usage page:

options:
  -v   verbose
  -V   Very verbose (DTivo additions)
  -j#  Jump # number of chunks into the file before processing. (Quicker checks of fixes...)
  -l#  Process # number of total chunks (once processing starts).
  -n   No output. Process as normal but don't save anything to disk.
  -m   Multiplex output. Results in a single MPEG-2 Program Stream. (Outputs in VideoFile, audiofile is needed but ignored).
  -d   Multiplex into (S)VCD sized PACKS (2324 bytes).
  -k   Build the edit KeyFrames file.
  -c*  Multiplex using the cut list file '*'.
  -p   Produce separate Multiplexed files using the cut list file '*'.
  -f   Fix the frame order at a cut for Ulead tools (use with -p mode)!
  -b   Generate a VOB format mpeg file.
  -a   Perform audio 'hole' patching.
  -t#  Perform audio transcoding. Options (1-6 see below):
         (s)VCD @ 44,100 -> (1 = 160, 2 = 192, 3 = 224kpbs)
         DVD    @ 48,000 -> (4 = 160, 5 = 192, 6 = 224kpbs)
  -h   help

-t# is what is new. T is for Transcode. The # corresponds to 1-6 for the various output modes. That is it. Everything else remains the same and everything that is new is automatic from that point on. Easy no? :)

In testing these out I discovered that mode #1 is bad. The output just doesn't work. I have no idea why. The fix will have to come from the author of the TyTranscode stuff itself. As a result if you pick that mode it will automatically roll-down to mode #2. Which will still work for SVCDs.
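The mode numbers and the roll-down can be summarised in a few lines of Python. This is only a sketch of the behaviour described above, not VSplit's own code, and `resolve_mode` is a made-up name.

```python
# (sample rate in Hz, bit-rate in kbps) for each -t# value in the usage page
MODES = {
    1: (44100, 160), 2: (44100, 192), 3: (44100, 224),   # (S)VCD
    4: (48000, 160), 5: (48000, 192), 6: (48000, 224),   # DVD
}

def resolve_mode(requested):
    """Return the mode that will actually be used for a -t# request."""
    if requested not in MODES:
        raise ValueError(f"transcode mode must be 1-6, got {requested}")
    # Mode 1 output does not work, so it is rolled down to mode 2.
    return 2 if requested == 1 else requested

print(resolve_mode(1), MODES[resolve_mode(1)])   # 2 (44100, 192)
```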

Keep in mind: If you have a DTivo and are burning to DVD you DO NOT NEED TO TRANSCODE.  If your source is an SA TiVo or the destination is an SVCD you must transcode the audio, however.

Here are the SVCD audio output modes:

Mode #1 - SVCD compatible. 44,100 sample rate @ 160 kbps.

Mode #2 - SVCD compatible. 44,100 sample rate @ 192 kbps.

Mode #3 - SVCD compatible. 44,100 sample rate @ 224 kbps.

The 224 mode was added for completeness' sake. But to be honest all it really does is make the audio output larger. Since fitting as much as possible on the disk is what people want, why would you want it to be larger?!!?

So I recommend mode 2 for people that are going to SVCD.

Here are the DVD audio output modes:

Mode #4 - DVD compatible. 48,000 sample rate @ 160 kbps.

Mode #5 - DVD compatible. 48,000 sample rate @ 192 kbps.

Mode #6 - DVD compatible. 48,000 sample rate @ 224 kbps.

As the SA TiVo by default records audio at 32,000 @ 192, I recommend mode #5 for transcoding to go to DVD. Thus the bit-rate remains the same but the sample rate is changed to what is needed for DVD.
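For the curious, here is what that change means for each audio packet, as a quick Python check. It relies on two standard facts about MPEG-1 Layer II (1152 samples per frame, and a frame size of 144 x bit-rate / sample rate bytes) and ignores the occasional padding byte; the function name is illustrative.

```python
SAMPLES_PER_FRAME = 1152   # fixed for MPEG-1 Layer II

def frame_stats(sample_rate, kbps):
    """Return (playing time in ms, size in bytes) of one Layer II frame.

    The size ignores the optional padding byte that some rate
    combinations (e.g. 44,100 Hz) need on some frames.
    """
    duration_ms = SAMPLES_PER_FRAME * 1000 / sample_rate
    size_bytes = 144 * kbps * 1000 // sample_rate
    return duration_ms, size_bytes

print(frame_stats(32000, 192))   # (36.0, 864)  SA TiVo source
print(frame_stats(48000, 192))   # (24.0, 576)  mode #5 output
```

So even at an unchanged 192 kbps, each output frame is shorter and smaller than the source frame. The 24ms result also matches the per-PU figure used in the Audio Holes section.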

Again I have tested both modes and they work. I took some SA TiVo streams to DVD with perfect sync, with cuts so there were no commercials. :)

2.1 How to Transcode using TyTool:

Even easier. There is a setting for this in the Options menu. Under Audio->Transcode Options just pick the mode you want. Just like everything else in the tool, this setting is saved in the .ini file; it is NONE by default. Once you pick a mode, you will have it every time you start TyTool from there on out. I figured those with SA TiVos would want it that way rather than picking it every time.

For those with 2 types of TiVos: create 2 TyTool directories. Since the .ini files are local to the directory TyTool was run from, you can easily have both.  Create desktop shortcuts for TyTool in each directory, renamed to make the difference clear of course, and set one to transcode and one not.

3.0 SVCD notes:

Alright now for the bad news. I found that several SVCDs which I created did not work in two of my DVD players. I thought I had done something wrong. So I checked the mux and it was perfect. Ouch. That was bad news at the time.

So I tried to mux the files as they were out to SVCD format using other tools, starting with mplex from the mjpegtools suite. It reported that the data rate was too high and that frames were being dropped. So it is a matter of the forced mux rate for an SVCD being so low by spec.

Then I tried bbmpeg and got the same exact results.

My code again mux'es my way and so the file will indeed play on the PC; it is just by its very nature not in spec for SVCD. There is nothing short of video re-encoding that can be done about this.

Now for the good news: what is produced is firmly within the category of an xVCD. Playing one basically just depends on a high tolerance within the DVD player for out-of-spec disks. My APEX 600a played all of the test SVCDs perfectly. Again that thing will play just about anything up to and including frisbees. :) (Except for bad audio rates. When I tried that it failed miserably. With 44,100 audio it works for everything I have tried.)

I met with similar success on other players: Apex 1100, 1100w, 1200, 1201, Mintek 1600.

So basically if you have a player that will take xVCD output you will be fine. If not, I can't even guess at your results.

4.0 DVD notes:

It works. What more can I say? Since the video data is well below DVD spec we have no mux'ing problems in that way (like the SVCD output). So we are good.

With the correct audio data rate you get exactly what you were all looking for. Perfect playback on the VOB itself.

