Discussion:
source code protection in compiled code?
Albert
2010-12-21 13:55:21 UTC
We develop comprehensive code and algorithms in MATLAB, compile them with MATLAB Compiler, link the result with C code (GUI, etc.), and deploy the application to customers. I now have some concerns about the safety of our source code and the protection of our intellectual property.

I understand that compiled code is AES-encrypted in the deployed CTF archives and remains encrypted even after the CTF archive is extracted.

Nevertheless, there are two important weaknesses:

- The key for encryption/decryption is known to MathWorks, and there is no way for us to control leaks.
(Maybe there are already cracks available. Has anybody heard of any?)
- All original filenames of the source are visible after extraction, which is a first important hint about the applied algorithms.

Any ideas how to overcome this weakness?
Thx, Albert
Albert
2010-12-23 08:57:05 UTC
Rune Allnor
2010-12-23 09:29:19 UTC
Post by Albert
We develop comprehensive code and algorithms in MATLAB, compile them with MATLAB Compiler, link the result with C code (GUI, etc.), and deploy the application to customers. I now have some concerns about the safety of our source code and the protection of our intellectual property.
I understand that compiled code is AES-encrypted in the deployed CTF archives and remains encrypted even after the CTF archive is extracted.
- The key for encryption/decryption is known to MathWorks, and there is no way for us to control leaks.
(Maybe there are already cracks available. Has anybody heard of any?)
- All original filenames of the source are visible after extraction, which is a first important hint about the applied algorithms.
Any ideas how to overcome this weakness?
First of all, become aware of what exactly you gain by
choosing matlab as platform: Speed during development.
You as developer can come up with a working solution
in a lot less time using matlab, than with most other
development platforms.

That's all.

Once one starts talking about other issues, like run-time
speed, ease/cost of distributing a solution, ease/cost of
maintaining the solution, various IP issues - well,
those are the kinds of things that have been low on
the list of priorities. If they were on that list at all.

If IP issues are a concern, then they need to be addressed
when making the strategic decisions about what development
platform to use.

Rune
Jan Simon
2010-12-23 12:38:06 UTC
Dear Albert,
Post by Albert
- The key for encryption/decryption is known to mathworks, and there is no way for us to control leaks.
You have absolutely no chance of keeping an encryption key 100% secret. The only way to create secret passwords is to use an esoterically high number of random bits.
If you do not trust TMW, don't use Matlab.
Post by Albert
- all original filenames of the source are visible after extraction, which is a first important hint on the applied algorithms.
As far as I understand, you want to sell software. I really hope that you tell your customers in plain text which algorithms you use! Any kind of obfuscation is overkill then. If you want to hide the function names, just replace them in the distributed package with func1, func2, etc.

Encryption does not delete the underlying information; it just hides it. If there were no way to unhide it, it would not be "encryption" but "deletion" of data. Encryption therefore only means making it more expensive for a user or hacker to decrypt the code than to buy the source from you. Nobody will spend a year of work decrypting a P-coded progress bar, even a very nice one. Using a secret key from TMW's CTF files is a violation of the license conditions and against the law. Do you assume that your customers are criminals? If so, give up on any protection: criminals tend to kidnap kids and ask you to send the source code for free.

If you desire 100% security, this has a psychological background, but it is not useful or applicable from the viewpoint of commercial interests or information technology. Remember that even Windows and MATLAB themselves have only a very limited level of protection! Do you remember the dramatically important copy protection of HD DVDs? Then ask Google for "09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0" today.

Anyhow, as you can see from my answer, I have struggled with the same problems for a long time too. My final decision was to deliver 95% of my programs in plain text as M-files and 5% as P-coded files, which makes an unauthorized copy a little bit harder. Then I decreased the price of the program and increased the price of my support. And finally I can sleep peacefully at night and drink some cups of coffee during the day.

Kind regards, Jan
Albert
2010-12-23 13:23:04 UTC
@ Rune: I think, as soon as mathworks offers and promotes tools for a continuous development chain from prototyping to deployment - like a compiler - it is legitimate to ask for IP protection in the delivery.

@ Jan: Could you please focus on the technical questions instead of explaining to me which business model to use? Everybody knows about the risks of software cracks and plagiarism, and it is legitimate to protect oneself against them, even if they are against the law and even if there is no 100% guarantee. Otherwise we wouldn't even have to lock our car doors ;-)

Of course replacing the function names is a good idea, but I would have to parse all the source code (and it is a lot) to replace all function calls as well.

kind regards, Albert
Jan Simon
2010-12-29 16:57:05 UTC
Dear Albert,
Post by Albert
@ Jan: could you please focus on the technical questions instead of explaining me what business model to use?
Sorry if you are not interested in my personal opinion about solving your problem by asking if it is a problem at all.
Post by Albert
- The key for encryption/decryption is known to mathworks, and there is no way for us to control leaks.
So my answer was: if you think that you have to control all leaks, but you cannot, reduce the need to control them.
Post by Albert
Of course replacing the function names is a good idea, but I would have to parse all the source code (and it is a lot) to replace all function calls as well.
Let Urs do the parsing for you:
http://www.mathworks.com/matlabcentral/fileexchange/17291
The automatic replacement is not trivial, because you cannot simply call STRREP if the function names are part of other symbols or strings. And if the names of local variables shadow function names, automatic replacement introduces bugs. So first I'd check with Matt Fig's tool:
http://www.mathworks.com/matlabcentral/fileexchange/27853
for shadowed function names.
Automatic renaming of functions also fails if you construct the function names dynamically for EVAL, EVALIN or similar ugly methods.

Unfortunately SYMVAR is not powerful enough to recognize all function calls reliably. E.g. the subfunction to recognize quoted parts does not handle this correctly:
S = 'This is a quote char: '''
But I assume that you can modify Urs' and Matt's functions such that the function names are returned together with their indices in the source code, in order to construct an automatic code obfuscator.
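
The pitfalls listed above (names that are part of other symbols, names inside strings, the doubled quote character) can be illustrated with a small token-aware renamer. This is only a minimal sketch, written in Python and operating on MATLAB source text; the function rename_functions and the smooth -> func1 mapping are hypothetical names for illustration and are not taken from Urs' or Matt's submissions. It assumes single-quoted strings and % line comments only, treats a quote directly after an operand as transpose, and does not detect shadowing local variables or block comments.

```python
import re

IDENT = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
NUMBER = re.compile(r"\d+\.?\d*(?:[eE][+-]?\d+)?")


def rename_functions(source, mapping):
    """Rename identifiers in MATLAB source text, leaving strings and comments alone.

    Hypothetical helper: `mapping` maps old function names to new ones.
    """
    out = []
    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        prev = source[i - 1] if i else " "
        if ch == "%":
            # Line comment: copy unchanged up to (not including) the newline.
            j = source.find("\n", i)
            j = n if j == -1 else j
            out.append(source[i:j])
            i = j
        elif ch == "'" and (prev.isalnum() or prev in "_)]}.'"):
            # A quote directly after an operand is the transpose operator.
            out.append(ch)
            i += 1
        elif ch == "'":
            # String literal: a doubled quote '' inside it is an escaped quote.
            j = i + 1
            while j < n:
                if source[j] == "'":
                    if j + 1 < n and source[j + 1] == "'":
                        j += 2
                        continue
                    break
                j += 1
            out.append(source[i:j + 1])
            i = j + 1
        else:
            # Numbers are matched first so that the "e3" in 1e3 is not
            # mistaken for an identifier.
            m = NUMBER.match(source, i) or IDENT.match(source, i)
            if m is None:
                out.append(ch)
                i += 1
                continue
            token = m.group(0)
            is_field = prev == "."  # s.name is a struct field, not a call
            if token in mapping and not is_field:
                token = mapping[token]
            out.append(token)
            i = m.end()
    return "".join(out)


example = (
    "S = 'This is a quote char: ''';\n"
    "y = smooth(x') + smooth2(x); % smooth again\n"
    "z = eval('smooth(x)');\n"
)
print(rename_functions(example, {"smooth": "func1"}))
```

Only the real call to smooth is renamed: smooth2, the comment and the quoted text are left alone. The last line of the example also shows the EVAL problem in practice, because the name inside the string is (deliberately) not touched, so the renamed program would break at run time.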

Kind regards, Jan
Albert
2011-01-03 15:43:06 UTC
Thx Jan for your answer, which is indeed more helpful than the last one ;-)

I know 100% protection is not possible, but maybe my post will be an impulse for MathWorks to improve safety. In my opinion, this could be done quite easily with two things:
- give the user the possibility to define their own key for encryption at compile time
- scramble the function names at compile time, which should be easy, because the code has to be parsed syntactically anyway.

Best regards,
Albert
Jan Simon
2011-01-03 20:09:04 UTC
Dear Albert,
Post by Albert
Thx Jan for your answer, which is indeed more helpful than the last one ;-)
Really?!
In my eyes my first suggestion vaporizes the problem, while the second adds a lot of work to your to-do list - an annoying kind of work, because it is useful only if your customers are criminals.
So I prefer the "shareware" approach, but of course, this is _your_ thread ;-)

There are code obfuscators for a lot of programming languages, but I do not know of one for MATLAB. Anyway, I would not deliver an obfuscated program without exhaustive and extensive tests.

Kind regards, Jan
Walter Roberson
2011-01-03 20:31:59 UTC
Post by Albert
I know 100% protection is not possible, but maybe my post will be an
impulse for MathWorks to improve safety. In my opinion, this could be done
- give the user the possibility to define their own key for encryption at
compile time
You are operating under a mistaken notion that Mathworks knows the AES key.

http://www.mathworks.com/help/toolbox/compiler/bsfey7f.html

'All the MATLAB files from a given CTF archive associate with a unique
cryptographic key. MATLAB files with different keys, placed in the same
CTF archive, do not execute. If you want to generate another application
with a different mix of MATLAB files, recompile these MATLAB files into
a new CTF archive.'

http://www.mathworks.com/support/solutions/en/data/1-2ZAVUJ/index.html?product=CO&solution=1-2ZAVUJ

'In the MATLAB Compiler documentation, it states: "Compiler 4 also uses
a Component Technology File (CTF) archive to house the deployable
package. All MATLAB files are encrypted in the CTF archive using the
Advanced Encryption Standard (AES) cryptosystem where symmetric keys are
protected by 1024-bit RSA keys."'


Keep in mind that the same MCR runtime is used for all deployed
applications (of the same compiler version.) Therefore MCR needs _some_
way of decrypting the CTF files, and allowing the user to specify their
own AES key would not remove that problem.

I gather from the description that the compiler generates a unique AES
key for each CTF file, and that it writes the AES key as part of the CTF
file by encrypting it with one half of a fixed RSA key pair. Then at
execution time, MCR decrypts the AES key using the other half of the
fixed RSA key pair, and uses the reconstituted AES key to read the AES
encrypted archive.

If my inference from the documentation is correct, then because
Mathworks has access to both halves of the RSA key pair, Mathworks could
decrypt the specific CTF AES key and use that to decrypt the files. This
is a bit different in detail than what you were discussing earlier, but
the overall effect would still be that Mathworks, if they had access to
your CTF, could decrypt it, as could anyone who managed to
reverse-engineer MCR enough to figure out the RSA key.

Having people able to select their own AES key would not help maintain
any privacy in the above situation.
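
The scheme inferred above can be sketched in a few lines. This is a toy model under loud assumptions, not the actual MCR implementation: textbook RSA over two small Mersenne primes stands in for the 1024-bit RSA key pair, a SHA-256 counter-mode keystream stands in for AES, and the names build_archive and open_archive are hypothetical. It needs Python 3.8 or later for the modular inverse via pow.

```python
import hashlib
import secrets

# Toy "fixed runtime key pair": textbook RSA over two Mersenne primes.
# Far too small and unpadded for real use; it only illustrates the data flow.
P, Q = 2**89 - 1, 2**107 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, assumed to live in every runtime


def keystream_xor(key, data):
    """Stand-in for AES: XOR the data with a SHA-256 counter-mode keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def build_archive(source, session_key=None):
    """Compiler side: encrypt the source, wrap the 16-byte session key with the public key."""
    key = session_key or secrets.token_bytes(16)
    wrapped = pow(int.from_bytes(key, "big"), E, N)
    return {"wrapped_key": wrapped, "payload": keystream_xor(key, source)}


def open_archive(archive, private_exponent):
    """Runtime side: unwrap the session key, then decrypt the payload."""
    key_int = pow(archive["wrapped_key"], private_exponent, N)
    return keystream_xor(key_int.to_bytes(16, "big"), archive["payload"])


source = b"function y = secret_algo(x)\ny = 2*x;\n"
generated = build_archive(source)                       # key chosen by the compiler
own = build_archive(source, session_key=b"albert's own key")  # key chosen by the user

# Whoever holds D opens both archives, regardless of who picked the session key.
print(open_archive(generated, D) == source)
print(open_archive(own, D) == source)
```

In this model the session key travels inside the archive, wrapped for the runtime, so letting the user pick it changes nothing for anyone who holds (or extracts) the runtime's private half.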

In order to improve upon the security, it would be necessary to generate
a *new* MCR for each deployed application.
Bruno Luong
2011-01-03 20:47:06 UTC
Post by Walter Roberson
In order to improve upon the security, it would be necessary to generate
a *new* MCR for each deployed application.
Or not generate any MCR at all, but do the real compilation to binary code.

Bruno
