Feb 20 2009
After some head scratching today I have managed to determine that the new PID which is used to get the rtmp stream URLs is generated from the new, long PID in the format ‘[64 chars]~[64 chars]’ (as shown when get_iplayer runs in --verbose mode). The correct PID is generated using some keys (in a var called ‘copyrighted_strings’) and AES crypto functions from this flash file: http://www.hulu.com/sec.swf (specifically the function ‘dec’). If anyone feels like writing/porting this in perl please let me know.
I just wrote a wrapper using haxe and executed their sec.swf using gnash. It’s clunky and has too many moving parts, but it works… I had to compile gnash with --enable-extensions=FILEIO to allow the script to write to stdout.
Take a look at http://www.highlandsun.com/hyc/huludif.txt for the source code.
Hi Howard, Wow! You have been busy. I’ve been away this weekend so sorry for the lack of response. For reasons of practicality, we really do need to find a way to do this without having to recompile gnash. I do like the idea of downloading the swf from hulu on-the-fly and using it – I was trying to figure out a good way of doing this with the fewest dependencies. Ideally we could just use the versions distributed by the usual Linux distros. Is swfdec any better in this case? I’m not sure what compile-time flags are used with gnash in Fedora/Ubuntu etc. For legal reasons, it would definitely be better not to distribute the keys of course. I am almost certain that the dec function was AES because, to find out, I just googled parts of their functions in actionscript and found some OpenSSL AES C source code which was almost entirely identical – probably copied from there I suspect! One thing I would like to do is to remove the dependency on bash/sh – quite easy to redo that bit in perl of course. In this way we can maintain better platform independence.
Right, you can invoke gnash directly from the perl script, just need to chomp($hpid) afterward to kill the trailing newline.
Also adding "-1" to the gnash invocation makes sure that gnash exits after the test.swf has executed.
Unfortunately, the distros tend to compile gnash without support for extensions, and being able to write to stdout is not a builtin feature of flash/actionscript. I haven’t tried the swfdec library yet but I’ll see if we can make it work the same way.
I played around with the OpenSSL aes-256 crypto functions but couldn’t come up with a match for the script’s output. I think that may have mostly been due to me not knowing the right initial value to use. So I gave up and went this route. Which is more effective anyway, since they’ve now changed sec.swf yet again – they can keep changing it every day and we’d have to keep changing our code to follow. Might as well just execute theirs and be done with it.
It took some digging but I figured out how to call into the swf using swfdec as well. It’s not supported by the public APIs, so not sure how viable this approach will be in the long run either. But fyi,
http://highlandsun.com/hyc/decswf.c
That’s quite neat. Still has the dependency problem though. Just a thought; can gnash write to a tmp file out-of-the-box? If so we could use this instead of stdout maybe?
No, without extensions enabled, gnash is completely sandboxed.
I’ve emailed the gnash developers list, asking for some kind of stdio to be enabled by default. Likewise I’ve emailed the swfdec authors asking for an officially supported API. We’ll see what happens…
No, not by default. Since Flash/Actionscript are designed to run inside a web browser, they’re normally sandboxed to prevent them from accessing anything on the real machine. A gnash extension is the only way to get outside the sandbox. I’ve emailed the gnash and the swfdec developers about all of this, hopefully they’ll provide a standardized call to support stdio by default, but even that won’t happen till some distant future release.
I’ve added the new option --hulu-decrypt-pid so that you can specify an arbitrary script to decrypt the pid. The script, when called as ‘myscript [encrypted-pid]’ should output [decrypted-pid] to stdout.
hi
I am also trying to fix the hulu issue and feel that hulu’s encryption can’t stand for long.
I was testing Howard Chu’s method using gnash and haxe.
I downloaded and compiled gnash with --enable-extensions=FILEIO.
Later I compiled the haxe compiler and then created Test.swf from compile.hxml.
I ran get_iplayer but it could not get the correct HPID, which has some garbage from gnash. Btw I also edited dec.sh and added -1 at the end because the flash never QUIT!
Any idea where I could have gone wrong?
I’ve pasted the output below.
TODO:
I’ll find a better method to bypass the haxe stuff
at least get this encryption bypassed (though downloading the swf and then decrypting is a better solution in the long run)
./get_iplayer --type=hulu --pid http://www.hulu.com/watch/57855/the-simpsons-take-my-life-please --rtmpdump '/root/rtmpdump' --verbose
get_iplayer v1.41, Copyright (C) 2009 Phil Lewis
This program comes with ABSOLUTELY NO WARRANTY; for details use --warranty.
This is free software, and you are welcome to redistribute it under certain
conditions; use --conditions for details.
INFO: User prefs dir: /root/.get_iplayer
INFO: System options dir: /etc/get_iplayer/options
Current options:
pid = http://www.hulu.com/watch/57855/the-simpsons-take-my-life-please
rtmpdump = /root/rtmpdump
type = hulu
verbose = 1
INFO: Search args: ''
INFO: Will try prog types: hulu
WARNING: Cannot read /root/.get_iplayer/download_history
INFO Trying to download pid using type hulu
INFO: pid not found in hulu cache
INFO: Cleaning pid Old: ‘http://www.hulu.com/watch/57855/the-simpsons-take-my-life-please’, New: ‘hulu-57855′
WARNING: Cannot read /root/.get_iplayer/download_history
INFO: flashnormal,flashlow modes will be tried
INFO: Trying flashnormal mode to download hulu: –
WARNING: ffmpeg does not exist – not converting flv file
INFO: Getting page http://www.hulu.com/watch/57855
.INFO: CID=14195191
INFO: Getting page http://r.hulu.com/videos?content_id=14195191
.INFO: HPID=69c5a4fdcbbe11d42613d10adb47f6d95d534d637340dca5c9cf0f4d07d4a56e~cbad119678449fa825ba83de055e48ddd9590075a7f971dbf7636f63391afa7b
INFO: HPID=RcInitFile: parsing /usr/local/etc/gnashrc
INFO: Getting page http://releasegeo.hulu.com/content.select?pid=RcInitFile: parsing /usr/local/etc/gnashrc&mbr=true&format=smil
WARNING: Failed to get programme stream data for from site
WARNING: Failed to get programme stream data for from site
WARNING: Failed to get programme stream data for from site
WARNING: No programme stream URLs were found for flashnormal, skipping
DEBUG: Download using flashnormal mode return code: ‘next’
INFO: skipping flashnormal mode
INFO: Trying flashlow mode to download hulu: –
WARNING: ffmpeg does not exist – not converting flv file
INFO: Getting page http://releasegeo.hulu.com/content.select?pid=RcInitFile: parsing /usr/local/etc/gnashrc&mbr=true&format=smil
WARNING: Failed to get programme stream data for from site
WARNING: Failed to get programme stream data for from site
WARNING: Failed to get programme stream data for from site
WARNING: No programme stream URLs were found for flashlow, skipping
DEBUG: Download using flashlow mode return code: ‘next’
INFO: skipping flashlow mode
ERROR: Failed to download ‘ – (hulu-57855)’
Best use the latest version of get_iplayer with the --hulu-decrypt-pid option. Then specify another script which will return the decrypted hulu pid given the encrypted one as the first argument. Make sure the stdout only contains the new decrypted hpid.
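One way to guard against stray output such as the "RcInitFile: parsing…" line in the log above is to filter whatever the decrypter prints before handing it on. The following is only a sketch, written in Python rather than perl for brevity. The function names are made up for illustration, and the 32-character pattern is an assumption based on the decrypted pids quoted in this thread, not a documented format.

```python
import re
import subprocess

# Assumption: a decrypted pid is 32 characters of letters, digits, "_" or "-",
# like the examples quoted in this thread. Adjust if that turns out to be wrong.
PID_RE = re.compile(r"[A-Za-z0-9_-]{32}")


def extract_pid(raw_output):
    """Return the last line of raw_output that looks like a decrypted pid, or None."""
    for line in reversed(raw_output.splitlines()):
        candidate = line.strip()
        if PID_RE.fullmatch(candidate):
            return candidate
    return None


def run_decrypter(command, encrypted_pid):
    """Run the helper as 'command... encrypted_pid' and return the filtered pid.

    command is a list such as ["./dec.sh"]; stderr is discarded so that only
    stdout is inspected.
    """
    result = subprocess.run(
        command + [encrypted_pid],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    return extract_pid(result.stdout)
```

A wrapper script built on this would print only the value returned by run_decrypter, so get_iplayer never sees the gnash noise.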
I too tried Howard Chu’s gnash/haxe method…and obtained the same result you did: gnash spit out “RcInitFile: parsing…”, not the decrypted PID.
On the other hand, Howard’s decswf.c worked like a champ for me. When compiled and embedded in a script supplied to get_iplayer via --hulu-decrypt-pid, it worked just fine.
Two other things I noticed:
1) some hulu URLs don’t find PIDs when requested via --pid. However, if the index number is obtained via the index file and then supplied via --get, get_iplayer obtains the PID OK and can get the file (subject to #2 below).
2) a number of hulu files are rtmpe/rtmpte encoded, which rtmpdump can’t handle. Any idea if/how this can be gotten around?
1) Can you give me some examples where this happens – I’ll see if I can debug it.
2) None that I know of yet.
I was about to give you an example, but it seems the creative folks at Hulu have changed things up yet again within the past 24 hours.
Now, even if one successfully decrypts the PID, the old way of referencing the SMIL does not work. One gets the following error message from get_iplayer:
WARNING: Failed to get programme stream data for…
Here’s more background on the problem. It relates to the need to retrieve a separate authentication key of some sort.
http://xbmc.org/forum/showpost.php?p=293818&postcount=36
Also:
There’s an updated sec.swf. If you look inside it, you find that they’ve added a second decryption function, decs.
I haven’t studied it much further yet, but that could be how the authentication key is generated.
If you look inside the player, you’ll see that nothing uses the decs function. The auth key is generated from the pid, inside the player.
I read your post at the xbmc forum about how to determine the authorization key:
http://xbmc.org/forum/showpost.php?p=294867&postcount=44
Great work!
Fyi, gnash 0.8.5 was released March 3 and no longer produces that RcInitFile garbage. I’ve also updated my huludif.txt file to add the "-1" option, since the fscommand("quit") isn’t enabled by default.
The swfdec / decswf approach is still better though. And of course, replacing it all with a few lines of perl is even better.
The game is getting rather boring though. I’ve gone back to BitTorrent for a couple shows; they get there sooner and are in full HD x264. I already pay for cable TV, so I’m already entitled to watch these shows (and hell, they were free over the air in the first place). If Hulu wants to keep alienating users, fine, we can tell ‘em where to get off…
I’ve been busy with other work… I made some patches to my gnash source too to prevent that garbage from getting into the output (it’s a gnash bug, it should be sending that crap to stderr not stdout). Supposedly in the current gnash source tree those patches have already been integrated, but I haven’t had time to look. In the meantime, my swfdec approach with decswf.c is much cleaner.
I just thought I’d try your decswf.c and have it compiled on a Fedora 10 system. How exactly do I use it? I tried downloading sec.swf?cb=20090319, renaming it to sec.swf and then running
decswf file:///tmp/sec.swf 57885
Is this the right procedure? It either core dumps or returns nothing.
That’s correct, but it sounds like you’re not building against swfdec 0.9.2.
Thanks for confirming I’m doing the right thing. I just checked and I did build against swfdec 0.9.2. I’ll try on another platform. What did you use?
I used ubuntu 8.10 x86-64
Oh wait, you’re not giving it the right argument. The pid you have to feed in is a long hex string, like this:
283dcbb2497239e473e6174011e5f39f7d39c4bfad8ebb2f505a2b20806a2c9e~511aea2060bc458efb53c9741edcd1c6f43
For that pid you should get this result
8jp5e3uapsbMphB5DpSGzl6X0cANjvIZ
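A quick check of the argument shape would catch the mistake above (passing the short numeric id instead of the encrypted pid) before the decrypter is even started. This is a sketch in Python with a made-up function name. It assumes only that an encrypted pid is two lowercase hex fields joined by "~", as in the HPID= lines that get_iplayer prints in verbose mode; it deliberately does not enforce a length.

```python
import re

# Assumption: two lowercase-hex fields joined by "~", as in the HPID= lines
# shown in the verbose logs. No particular field length is enforced here.
ENCRYPTED_PID_RE = re.compile(r"[0-9a-f]+~[0-9a-f]+")


def looks_like_encrypted_pid(arg):
    """True if arg has the hex~hex shape of an encrypted Hulu pid."""
    return ENCRYPTED_PID_RE.fullmatch(arg) is not None
```

With this, "57885" is rejected because it has no "~" separator, while the long hex string above is accepted.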
I successfully compiled decswf.c on OSX Tiger and prepared a script that decrypts Hulu’s PIDs using decswf, but it seems that hulu changed the http://releasegeo.hulu.com/content.select page.
Here’s what I get when I try to download House’s Unfaithful episode
unit1:~/get_iplayer alex$ ./get_iplayer.pl --rtmpdump='./rtmpdump' --ffmpeg='./ffmpeg' --hulu-decrypt-pid='./hulu-decrypt-pid.sh' --type=hulu --subtitles --get 200933 --verbose
get_iplayer v1.42, Copyright (C) 2009 Phil Lewis
This program comes with ABSOLUTELY NO WARRANTY; for details use --warranty.
This is free software, and you are welcome to redistribute it under certain
conditions; use --conditions for details.
INFO: User prefs dir: /Users/alex/.get_iplayer
INFO: System options dir: /etc/get_iplayer/options
Current options:
ffmpeg = ./ffmpeg
huludecpid = ./hulu-decrypt-pid.sh
rtmpdump = ./rtmpdump
subtitles = 1
type = hulu
verbose = 1
INFO: Search args: ‘200933’
INFO: Additionally getting cached programme data for hulu
INFO: got 2266 cache entries for hulu
Matches:
200933: House – Unfaithful, Drama
INFO: 1 Matching Programmes
WARNING: Cannot read /Users/alex/.get_iplayer/download_history
INFO: flashnormal,flashlow modes will be tried
INFO: Trying flashnormal mode to download hulu: House – Unfaithful
INFO: Getting page http://www.hulu.com/watch/59164
.INFO: CID=14371221
INFO: Getting page http://r.hulu.com/videos?content_id=14371221
.INFO: HPID=02e5e4d63d05de9051a5462a8b931a1ba971dedeb072462aece77f8e70d5434e~35d5a888426b4047b6407c2af34444988c7392816e4a7cb69de2a71a85716dfa
INFO: Running command ‘./hulu-decrypt-pid.sh 02e5e4d63d05de9051a5462a8b931a1ba971dedeb072462aece77f8e70d5434e~35d5a888426b4047b6407c2af34444988c7392816e4a7cb69de2a71a85716dfa’
INFO: Getting page http://releasegeo.hulu.com/content.select?pid=ah3WpIcQdCRCFs2X29ZAJknDc2a_V7te&mbr=true&format=smil
WARNING: Failed to get programme stream data for House from site
WARNING: Failed to get programme stream data for House from site
WARNING: Failed to get programme stream data for House from site
WARNING: No programme stream URLs were found for flashnormal, skipping
DEBUG: Download using flashnormal mode return code: ‘next’
INFO: skipping flashnormal mode
INFO: Trying flashlow mode to download hulu: House – Unfaithful
INFO: Getting page http://releasegeo.hulu.com/content.select?pid=ah3WpIcQdCRCFs2X29ZAJknDc2a_V7te&mbr=true&format=smil
WARNING: Failed to get programme stream data for House from site
WARNING: Failed to get programme stream data for House from site
WARNING: Failed to get programme stream data for House from site
WARNING: No programme stream URLs were found for flashlow, skipping
DEBUG: Download using flashlow mode return code: ‘next’
INFO: skipping flashlow mode
ERROR: Failed to download ‘House – Unfaithful (hulu-59164)’
Using Wireshark while watching the episode with Safari it appears that the content.select page is now /select.ashx and accepts two parameters: pid and auth.
Safari requested http://releasegeo.hulu.com/select.ashx?pid=ah3WpIcQdCRCFs2X29ZAJknDc2a_V7te&auth=cc3349e2f2bc0517df801e04615ea706 and got a smil page with rtmp urls.
I changed your perl code
$prog->{metadata_url} = "http://releasegeo.hulu.com/content.select?pid=$hpid&mbr=true&format=smil";
with
$prog->{metadata_url} = "http://releasegeo.hulu.com/select.ashx?pid=$hpid&auth=cc3349e2f2bc0517df801e04615ea706";
and the episode downloads, but, as expected, others don’t… we have to get the correct auth parameter, but it’s late and I need to sleep…
Thanks for that – I’ve just released v1.43 incorporating that change. The --hulu-decrypt-pid program now needs to return ‘[decrypted pid] [auth string]’ on stdout.
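To make the new helper contract concrete, here is a rough sketch of the consuming side: split the helper’s stdout into pid and auth, then build the select.ashx URL reported earlier in this thread. It is written in Python rather than perl, the function names are hypothetical, and it is not a copy of what get_iplayer does internally.

```python
from urllib.parse import urlencode


def parse_helper_output(output):
    """Split '[decrypted pid] [auth string]' into a (pid, auth) tuple."""
    fields = output.split()
    if len(fields) != 2:
        raise ValueError("expected '<pid> <auth>', got %r" % output)
    return fields[0], fields[1]


def metadata_url(pid, auth):
    """Build the select.ashx URL seen in the Wireshark capture above."""
    query = urlencode([("pid", pid), ("auth", auth)])
    return "http://releasegeo.hulu.com/select.ashx?" + query
```

Rejecting anything other than exactly two whitespace-separated fields means stray helper output fails loudly instead of ending up inside the URL.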
Howard Chu’s method of determining the authentication, which he mentioned above in a reply to one of my posts, works well (at least until Hulu changes things up again!)
The details are at this link:
http://xbmc.org/forum/showpost.php?p=294867&postcount=44
Since one needs md5, I found it easy to hack get_iplayer and use md5_hex from Digest::MD5 on the pid supplied via the --hulu-decrypt-pid program.
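Here’s a simple Perl script to get the auth parameter from the decrypted pid
#!/usr/bin/perl
# Print the auth parameter (an MD5 hex digest) for a decrypted Hulu pid.
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
# The decrypted pid is taken from the command-line arguments.
my $pid = "@ARGV";
# Placeholder: replace the asterisks with the secret string from player.swf.
my $pid_suffix = '****************************************************************';
my $extended_pid = "$pid$pid_suffix";
my $auth = md5_hex($extended_pid);
print $auth;
The asterisks have to be replaced with a “secret string” that is embedded in http://www.hulu.com/player.swf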
Flare is a useful tool to decompile swf files and getting stuff…
Hi, I just wrote a set of scripts useful to retrieve Hulu’s pid and auth parameters.
These tools can be used as helpers for applications like get_iplayer https://linuxcentre.net/getiplayer/
List of contents:
hulu-get-keys.c : Based on decswf.c, this program must be compiled against
swfdec-0.9.2 and is used to retrieve pid encryption keys from the enc.swf file
downloaded from Hulu.
hulu-get-keys.pl : This script first downloads player.swf from Hulu, decompiles it using flare (you
must download the command line version from http://www.nowrap.de/flare.html),
retrieves the auth encryption key from the decompiled source and stores it in
hulu.auth.keys file; then it downloads enc.swf from Hulu, uses hulu-get-keys to
retrieve pid encryption keys and store them in hulu.pid.keys file.
hulu-get-pid.pl : This script, based on a script by Andrej Stepanchuk, accepts Hulu’s encrypted
pid as parameter and, using hulu.pid.keys files, returns decrypted pid on stdout.
hulu-get-auth.pl : This script accepts Hulu’s decrypted pid as parameter and, using hulu.auth.keys
files, returns auth parameter on stdout.
hulu-get-pid-auth.pl : This script accepts Hulu’s encrypted pid as parameter and, using both
hulu-get-pid.pl and hulu-get-auth.pl, returns both decrypted pid and auth
parameter on stdout. This script can be used as a get_iplayer helper using its
--hulu-decrypt-pid parameter.
You can download the package from http://www.megaupload.com/?d=G1ZKIOHQ
Locutus:
Great work, thanks!
A few comments which other users might also find helpful:
1) I see that your method calls the swfdec api to extract the copyrighted strings, rather than, as in Howard Chu’s original code, to actually perform the PID decryption. Then you separately perform Rijndael decryption on them. With all due respect, HC’s method seems more robust, as it makes no assumptions about what decryption lies within sec.swf. If hulu changes that, while leaving the dec function call, then HC’s decswf will still work while yours will break.
2) That having been said, since you use flare to extract the authorization key from player.swf, you could alternatively also use flare to grab the copyrighted strings from sec.swf, and skip the swfdec api calls entirely. Here’s a perl one-liner which prints the four copyrighted strings to stdout:
perl -ne ‘if (@cs=/47([\w (),.]{63,})47/g){print join(“\n”,@cs), “\n”;}’ sec.flr
3) I’ve noticed too that a simple “wget” on sec.swf won’t work. I find your work-around interesting. From reading the comments in the xbmc forum regarding how to get gnash to work, I discovered that one can also successfully “wget” sec.swf if one supplies a browser-like user agent. I’ve found this to work:
wget -nH -U Mozilla/5.0 http://www.hulu.com/sec.swf
Ooops…something screwed up when I did a cut-and-paste…
In my perl one-liner, the two instances of “47” should be “47” (the octal code for single-quote, ‘).
And you may have to change the “smart” single and double quotes in the string to ASCII.
BTW, there’s also a single space within the character class.
ARGH!!
Let me try this one last time: In the original perl one-liner, replace “47” with the four ASCII characters: backslash-zero-four-seven.
I’m out!
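Since the blog keeps mangling the quoting, here is the same extraction idea sketched in Python, where the single quote can be written directly in the pattern. It looks for single-quoted runs of 63 or more characters from the class used in the one-liner (word characters, space, parentheses, comma, full stop). The function name is invented for illustration, and the input is assumed to be the text of the flare output file (sec.flr).

```python
import re

# Single-quoted runs of 63 or more characters drawn from [\w (),.],
# the same character class as the perl one-liner (note the space in it).
QUOTED_RE = re.compile(r"'([\w (),.]{63,})'")


def copyrighted_strings(flr_text):
    """Return every long single-quoted string found in decompiled flare output."""
    return QUOTED_RE.findall(flr_text)
```

To use it, read sec.flr into a string and print each returned item on its own line, which mirrors what the one-liner does.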
First let me say that I put together other people’s great work and just added a bit of mine.
About your points:
1) I totally agree that Howard Chu’s code seems more robust, but my goal was to provide redistributable code with no keys inside, and to separate key extraction, which relies on compiled code (hard to port to different platforms like Windows or embedded ones), from the code that actually performs the decryption, which is completely scripted and can run on any machine that supports perl scripting. Doing so, you can extract keys once in a while on a machine that can compile and run C code based on swfdec-0.9.2 (i.e. a Linux machine), get the keys and put them on another machine where you will run get_iplayer with my decryption scripts (i.e. a Windows machine or an embedded one). I think this approach can also be used as a base for a python XBMC script compatible with original Xboxes. My ultimate goal is to run get_iplayer on my Dreambox (a Linux SAT-PVR), where I think I can compile swfdec-0.9.2, hulu-get-keys.c and flasm (http://www.nowrap.de/flasm.html to be used instead of flare). As a final note, I think that Hulu can easily break decswf.c, but I still agree that it’s more robust.
2) If you look at hulu.pid.keys generated with hulu-get-keys.pl you’ll see that it contains 5 keys, 2 more than the 3 ones you get from the flare/regex approach. The 2 extra keys are generated by crippled code in sec.swf.
3) It’s interesting. My method is the one actually used by player.swf to get sec.swf, but this simpler method seems to work fine for now.
I think the robustness of decswf’s approach is a moot point. The change after that is when they added the auth key, which broke things regardless of decswf. Hulu can keep on going, adding new and different things thus making it irrelevant how we approach any single piece of their obfuscation.
I personally like extracting the keys and just decrypting from perl; it’s a lot faster since it doesn’t have to initialize an entire actionscript virtual machine first.
Sorry, I don’t quite get what people above are saying. It seems they have some kind of break through. So…can we download stuff on Hulu now? Could someone summarise it please?
The method that works is kindly described by Locutus73 below.
Note that this method no longer works as of last week. Hulu are now encoding normal page requests so that they serve up just large strings of hex that need yet another decoder…. The method is already known but I haven’t gotten around to porting it to perl with no obscure perl module dependencies (anyone?). See http://tunerfreemce.cvs.sourceforge.net/viewvc/tunerfreemce/TunerFreeMCE/Code/HuluTranslator.cs?view=markup specifically the method _dobfu.
so currently there is no way to download hulu?
Yes – there is always a way somewhere! – just not using get_iplayer at present.
Easy part: download hulu-get.zip from http://www.megaupload.com/?d=G1ZKIOHQ and Flare from http://www.nowrap.de/flare.html.
Tricky part: compile swfdec-0.9.2 and hulu-get-keys.c.
Easy part: launch hulu-get-keys.pl once in a while to get Hulu’s decryption keys.
Easy part: launch get_iplayer.pl with the --hulu-decrypt-pid ./hulu-get-pid-auth.pl parameter.
I did all this and I get the hulu.pid.keys but nothing in the hulu.auth.keys file. Howard Chu suspects this is due to a change in the SWF again… So unfortunately I cannot test or fix the problem in get_iplayer (yet).
I’ve just tried on my Mac and all went fine. I think that the cause could be:
1) Something went wrong downloading player.swf
2) Something went wrong decompiling player.swf. Does the flare executable reside in the same directory as hulu-get-keys.pl (I call “./flare”, not “flare” nor “flare.exe”)? Does it produce player.flr?
3) Something went wrong parsing player.flr. Maybe flare produces slightly different code on my Mac, so we have to adjust the regular expression used to find auth keys.
Maybe you can contact me directly by email so we can sort this out with some simple debugging.
Thanks, it was (2). I had flare installed in /usr/bin/flare
How do i compile hulu-get-keys.c? there is no makefile.
has anyone managed to make this work under windows?
is there any way to download heroes on get_iplayer. i just get a message saying failed using hulu and iplayer.
If heroes is on iPlayer then using the flash vmodes (flashhigh, flashnormal or flashvhigh) will work. If on hulu, well, that depends on what the Hulu obfuscation team have decided to change today…i.e. don’t bother…
I wasn’t able to get swfdec-0.9.2 to compile on cygwin. Actually, it didn’t even get past the configure stage. Here is the error.
checking for GLIB… configure: error: glib-2.0, gobject-2.0 and gthread-2.0 >= 2.16 are required to build swfdec
Help please. Thanks!
dependency error, the glib in cygwin package is too old. i tried to compile it myself, it doesn’t work. i have given up.
Following Locutus73’s notes from above, I downloaded the hulu-get.zip package, and compiled swfdec-0.9.2 and hulu-get-keys.c successfully. I’m using Ubuntu 8.10, get_iplayer v1.59.
I try to run the following command:
get_iplayer --type=hulu --pid http://www.hulu.com/watch/21660 --hulu-decrypt-pid='./hulu-get-pid-auth.pl'
and this is the output I receive:
get_iplayer v1.59, Copyright (C) 2009 Phil Lewis
This program comes with ABSOLUTELY NO WARRANTY; for details use --warranty.
This is free software, and you are welcome to redistribute it under certain
conditions; use --conditions for details.
WARNING: Cannot read /home/luke/.get_iplayer/download_history
INFO Trying to download pid using type hulu
INFO: pid not found in hulu cache
WARNING: Cannot read /home/luke/.get_iplayer/download_history
INFO: flashnormal,flashlow modes will be tried
INFO: Trying flashnormal mode to download hulu: –
…WARNING: No programme stream URLs were found for flashnormal, skipping
INFO: skipping flashnormal mode
INFO: Trying flashlow mode to download hulu: –
.WARNING: No programme stream URLs were found for flashlow, skipping
INFO: skipping flashlow mode
ERROR: Failed to download ‘ – (hulu-21660)’
any ideas?
Luke, Hulu have added another layer of obfuscation. Now the SMIL file, from which the actual fms streaming video urls are extracted, is encrypted as well. E.g. this one for http://www.hulu.com/watch/65098/family-guy-transporter:
-> http://s.hulu.com/select.ashx?pid=mr9JdoQ0hSohpA6czVzNb8sjtx5ZZW5_&auth=0771fa84d8c6fac3250b9cabd60004e1&v=435984533
On top of that, the dsc.swf file, which decrypts this thingie, has been run through a code obfuscator, and decompiling it with flare will not produce anything meaningful at the moment.
Note: this encryption has nothing to do with the _dobfu function.
Fun fun fun.