Mailing List Archive



[tlug] Open source license (wikipedia)



A few days old now, but I'm not really confident of the answers given.

Darren Cook writes:

 > Semi-hypothetical question: If I take a bunch of text from wikipedia,
 > and make an md5 hash from it, can I release that hash code under a CC0
 > or CC-BY license? Or am I legally obligated to release it under the
 > CC-BY-SA license?

A hash code cannot be expressive, so it is not covered by copyright in
the U.S.  I haven't read the Japanese law recently, but I believe it has
the same definition of "copyrightable".
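
To make the technical half of that concrete, here is a minimal sketch in
Python using the standard hashlib module.  The helper name md5_hex and
the sample strings are made up for illustration.  It only shows that an
MD5 digest is always 128 bits (32 hex characters) no matter how much
text goes in, so nothing of the original wording is carried along.  It
says nothing about how a court would treat it.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of `text` as 32 hexadecimal characters."""
    # MD5 operates on bytes, so encode the text first (UTF-8 assumed here).
    return hashlib.md5(text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    short_text = "abc"
    # Hypothetical stand-in for a long article body.
    long_text = "some article text " * 10000

    for sample in (short_text, long_text):
        digest = md5_hex(sample)
        # The digest length stays fixed while the input length varies.
        print(len(sample), len(digest), digest)
```

Both inputs yield a digest of the same length, which is the sense in
which the hash is a fixed-size fingerprint and not a copy of the text.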

 > My real question, of course, is can I train a machine learning
 > model on that text data, and release it under a more liberal
 > license? Assuming the model is effectively a one-way hash, and
 > cannot reproduce the original data.

That's a harder question.  It really depends on exactly what the model
does.  To the extent that the training corpus is multiauthor, I don't
see how you would capture a copyrightable "expression fixed in some
medium".  If the functionality you provide is purely as a recognizer
or filter, that's not copying the expression.

Meta: It is a very bad idea to reason by analogy with respect to law
(here, ML model ~= hash).  Yes, that's what judges in common-law
jurisdictions in fact do, but (1) they have a very stylized set of
analogies that they use that make no sense to ordinary people (and
confuse the heck out of Real Lawyers (Right?) like Michael Cohen and
Rudy Giuliani), and (2) Japan is not a common-law country, so you have
to guess the personal analogies used by the judge you're gonna face.

Steve

