Mailing List Archive
[tlug] Open source license (wikipedia)
- Date: Thu, 3 May 2018 22:11:15 +0100
- From: Darren Cook <darren@example.com>
- Subject: [tlug] Open source license (wikipedia)
- User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0
Semi-hypothetical question: If I take a bunch of text from Wikipedia, and make an MD5 hash from it, can I release that hash under a CC0 or CC-BY license? Or am I legally obligated to release it under the CC-BY-SA license?

My real question, of course, is: can I train a machine learning model on that text data, and release it under a more liberal license? That is assuming the model is effectively a one-way hash, and cannot reproduce the original data.

My hunch is yes, it is allowed; but I'd love a pointer to an authoritative source.

As some background, we've been hacking on a Japanese tokenizer, https://github.com/rakuten-nlp/rakutenma, which is under the MIT license. But they trained on BCCWJ, and Rakuten would have paid the extra 400,000 yen to allow releasing that model (as "research results for commercial use"): http://pj.ninjal.ac.jp/corpus_center/bccwj/en/fee.html

If/when we do release our version, I think it'd be better to have it come with a model built from open data, such as Wikipedia. And, if at all possible, I want to stick with MIT/CC0/CC-BY licenses.

Google are able to sell n-gram data that they have trawled from the Internet, with their own usage restrictions (https://research.googleblog.com/2006/08/all-our-n-gram-are-belong-to-you.html), which implies to me that you can release a statistical analysis of data under any license you choose. But maybe it is more complicated than that?

Thanks,
Darren
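
To make the two cases in the question concrete, here is a minimal Python sketch of an MD5 digest and a table of n-gram counts computed from the same text. The sample sentence and the function names (`md5_digest`, `char_ngram_counts`) are illustrative assumptions, not taken from rakutenma or from Google's n-gram release, and character bigrams stand in for whatever statistics a real model would store.

```python
import hashlib
from collections import Counter


def md5_digest(text: str) -> str:
    """Return the 32-character hex MD5 digest of the UTF-8 encoded text."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def char_ngram_counts(text: str, n: int = 2) -> Counter:
    """Count every overlapping character n-gram in the text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


# Hypothetical stand-in for text copied from a CC-BY-SA source.
sample = "the cat sat on the mat"

digest = md5_digest(sample)            # fixed 128 bits, whatever the input size
counts = char_ngram_counts(sample, 2)  # grows with the variety of the input

print(digest)
print(counts.most_common(3))
```

The digest has the same length for any input, so it cannot carry the text itself. The n-gram table keeps short fragments of the source along with their frequencies, which is closer to what a trained tokenizer model holds. The code only shows that difference; it does not settle whether either output counts as a derivative work.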
- Follow-Ups:
- Re: [tlug] Open source license (wikipedia)
- From: Raymond Wan
- [tlug] Open source license (wikipedia)
- From: Stephen J. Turnbull