Are we all plagiarists now? (economist.com)
89 points by pseudolus 4 hours ago | 104 comments




Here is my very simple view:

- exact reuse of long-ish word sequences without credit -> not cool.

- complete/partial reinterpretation of an already existing story in different words -> it's fine

- Traced/almost identical image/drawing/painting (with the intent to fool someone) -> not cool

- Visual imitation in style/content but different approach or usage -> it's fine

I think people are too attached to the novelty of something. Sure, if I write a bunch of words and you repeat them as yours, that's not cool. But if something I make inspires someone and they take it, reframe it, rephrase it or whatever, go ahead.

People adore Star Wars, which is an absolute one-to-one of the hero's journey, and it still has value. Most modern fantasy is basically fanfic of Middle Earth, and it's still good that it exists.

Imagine someone just spamming random sequences of notes for their whole life. Does that mean they own anything similar made afterwards, for life plus 70/80/90... years?


The law broadly agrees with you here.

Non transformative use -> Not cool.

Transformative -> it's fine

Original work attempting to deceive, or to pass its origin off as being by another -> not cool

Original work emulating the style of another without attempting to imply involvement of the other -> it's fine.


Derivative work isn't automatically allowed under copyright law, whether or not you're trying to “deceive” people.

Depends if it's sufficiently transformative or not.

> People adore Star Wars, which is an absolute one-to-one of the hero's journey, and it still has value.

Yeah, but the Hero's Journey is not a literal story. It's more of a framework, laid out in Joseph Campbell's book "The Hero with a Thousand Faces", for what makes a story interesting and how original stories like myths and folklore (including the Bible) have always followed the same pattern.

Campbell dissected that pattern, and it has since been followed by many writers and creators as a model of a good story. Screenwriting classes literally teach it, along with other frameworks like the three-act structure.

And if you really look into it, almost all good stories follow that pattern to some extent, but it is the implementation that makes each story special.

It's a bit like saying "People adore [x] webapp, which is an absolute one-to-one of React, and it still has value", when the two are fundamentally different things.


I think this is correct, and that it's school which (with good intentions) overemphasises the importance of complete originality.

It's less about originality than crediting sources.

If I restate something using completely my own words, I'm still supposed to cite the source where I got the idea.

If something is completely my own invention, and I didn't use any sources to create it, then that's original and I don't need to credit anyone else. But that's very rare.


How do you account for the compilation of your insight that was formed through the consumption of many prior examples? Do you feel compelled to thoroughly cite them, or have they crossed a threshold, marked by your ability to generate new, similar things without directly referencing them, where it's "all original you" now?

Yeah there's some grey area there I guess. But it took me quite a while as a student to understand that I needed to cite sources even if I was "using my own words" and not quoting passages verbatim.

Certainly there are styles and broad arcs that many creations follow that are not directly attributable to a specific source.


If you're writing an academic/research paper, you still have to find something to cite.

"I know this stuff, just trust me" isn't a valid citation. The point is to give anyone who reads the paper a way to a) verify that each fact you put in the paper has solid academic sourcing, and b) find more information about it if they wish.

If you know a lot of stuff about the topic already, that's great—but unless you've already written and published papers on the subject, you can't just cite yourself.


Also, at some point citing is not needed. If I use addition I do not need to cite the relevant parts of, for example, Principia Mathematica.

In the end, hard lines are very hard to draw.


Everything Is A Remix.

Producing something entirely novel in an act of pure creativity is essentially a tall tale - like Newton and the Apple - possibly some truth to it, but definitely mythologized.


I don't think this is entirely correct: mutants exist. Every once in a while in nature, something goes wrong, something random happens, and you get something novel and new. This happens in creativity as well. So most things are remixes, but entirely new, novel things do exist, because the world is not static; it is random.

> Universities are increasingly turning to AI to spot AI-written work (even as students use services like Dumb it Down to make their AI-fuelled work sound more believable). It can be detected. Chris Caren, the boss of Turnitin, a popular plagiarism detector, describes plagiarised prose as “beige”: “well-written, but not very dynamic”. It has verbal tics: it is keen on dreary words like “holistic” and notably keen on “notably”.

I don't think you can say that AI-written text can be reliably detected. Turnitin is only ~90% effective: https://teaching.temple.edu/sites/teaching/files/media/docum...


I tried a lot of these tools, including Turnitin, and I think they are all wrong. Not because they are a bad implementation, but just because the problem is naturally impossible in a lot of cases.

There are people whose style is closer to AI's; that doesn't mean they used AI. And sometimes AI outputs text that looks like what a human would write.

There is also the mixed case: if I write two pages and use two sentences by AI (because I was tired and couldn't find the right phrasing), I may be flagged for using AI. Even worse, if I ask AI for advice and then rewrite it myself, what should the verdict be? I can make an argument that either answer (AI-written or not AI-written) would be wrong.


> There is also the mixed case: if I write two pages and use two sentences by AI (because I was tired and couldn't find the right phrasing), I may be flagged for using AI.

None of these tools are binary. They give a percentage score, a confidence score, or both.

If you include one AI sentence in a 100-sentence essay, your essay will be flagged as 1% AI and nobody will bat an eye.


They are not binary but the score isn't linear in my experience either. It isn't that they assign a score to each sentence and then do an aggregation.

It's not, but the fact that one sentence deserves a high score doesn't automatically mean the entire thing will be flagged as a false positive. Unless it's, like, two sentences in total.

Yeah, and to be blunt, beige and not dynamic is how I would describe most student writing done entirely by the human. I just don't see how the output of a model trained on a vast corpus of such writing could ever be successfully and reliably distinguished from human writing. You can distinguish good writing from so-so writing; that's about it.

In an educational context, the only purpose of the writing has traditionally been learning, and the purpose of turning it in has been to prove that the learning took place. Both of those are out the window now. Classroom discussion and oral presentations might be the only place you can still prove learning took place. Until everybody gets hidden AI-powered earpieces of course.


I take suspicious student papers and feed them to Turnitin, as well as to the popular LLMs. Hey ChatGPT, give me a report on the likelihood that this paper was generated by an LLM. Do that with Gemini, Claude, etc.

Then if there's a high probability, I look through the references in the paper. Do they say what the student attributes to them?

Finally, if I still think it's AI-generated, I have the student in and ask questions about the paper. "You said this here in this paragraph -- what do you mean by that?"

AI detectors are a first pass, but I think a human really needs to be in the loop to evaluate whether it's cheating, or just someone using a tool to clean up grammar and spelling.


> [can’t] be reliably detected… only ~90% effective

I'm surprised to see these comments in conjunction; 90% is pretty good, and much higher than I expected. I wonder what the breakdown of false positives/false negatives is.

Edit: from the linked paper

> Of the 90 samples in which AI was used, it correctly identified 77 of them as having >1% AI generated text, an 86% success rate. The fact that the tool is more accurate in identifying human-generated text than AI-generated text is by design. The company realized that users would be unwilling to use a tool that produced significant numbers of false positives, so they “tuned” the tool to give human writers the benefit of the doubt.

This all seems exceptionally reasonable. Of the samples with AI, they correctly identify 86%. Of the samples without AI, they correctly identify a higher proportion, because of the nature of their service. This implies that if they _wanted_ to make a more balanced AI detection tool, they could get that 86% somewhat higher.


>90% is pretty good, and much higher than I expected.

The problem with that at scale is that those who skirt by within that 10% might one day be your doctor, your lawyer, or your accountant, and you'd never know until it bit you in the ass.


> I'm surprised to see these comments in conjunction; 90% is pretty good, and much higher than I expected.

What standard of proof is appropriate to expel someone from college? After they've taken on, say, $40,000 of debt to attend?

Assuming you had a class of 100 students, "90% effective" would mean expelling 10 students wrongly - personally I'd expect a higher standard of proof.


Anyone expelling a student over a single "AI" label from Turnitin alone is a complete idiot. Perhaps that happens occasionally, but that's clearly the result of horrible decision making that isn't really Turnitin's fault.

Anyone who gives 10 seconds of thought to how this could help realizes that at 90% it's a helpful first pass. Motivated students who really want to hide can probably squeak past more often than you'd like. And you know there will be false positives, so you do something like:

- review those more carefully, or send them to a TA if you have one

- keep track of patterns of positives from each student over time

- explain to the student it got flagged, say it's likely a false positive, and have them talk over the paper in person

I’m sure decent educators can figure out how to use a tool like that. The bad ones are going to cause stochastic headaches for their students regardless.


That's not what 90% effective means. Tests don't work that way.

Tests can be wrong in two different ways, false positive, and false negative.

The 90% figure (which people keep rounding up from 86% for some reason, so I'll use that number from now on) is the sensitivity, or the ability to avoid false negatives. If there are 100 cheaters, the test will catch 86 of them, and 14 will get away with it.

The test's false positive rate, how often it says "AI" when there isn't any AI, is 0%; or equivalently, the test's "specificity" is 100%.

> Turnitin correctly identified 28 of 30 samples in this category, or 93%. One sample was rated incorrectly as 11% AI-generated[8], and another sample was not able to be rated.

The worst that would have happened according to this test is that one student out of 30 would be suspected of AI generating a single sentence of their paper. None of the human authored essays were flagged as likely AI generated.
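To make the arithmetic concrete, here's a minimal sketch of the confusion-matrix math in Python, using the counts quoted above from the linked study (whether the 11%-flagged sample and the unratable one count as errors is exactly what gets disputed downthread):

    # Confusion-matrix arithmetic for the linked Turnitin study.
    # Counts quoted in this thread: 90 AI-assisted samples, 77 flagged;
    # 30 human-written samples, 28 rated correctly.
    ai_total, ai_caught = 90, 77
    human_total, human_cleared = 30, 28

    sensitivity = ai_caught / ai_total          # share of AI essays caught
    specificity = human_cleared / human_total   # share of human essays cleared

    print(f"sensitivity: {sensitivity:.0%}")    # ~86%: 14 of 100 cheaters slip by
    print(f"specificity: {specificity:.0%}")    # ~93%: the disputed figure

The point is that "90% effective" names only one of these two rates, and the expulsion math upthread only follows if you read it as the false positive rate, which the study measured as essentially zero.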


Expulsions don’t happen. International students have been cheating rampantly for decades. Universities are happy enough to collect their tuition.

My son, who just finished his first semester at college, said the thing that surprised him the most was the blatant cheating all around him. He said it is rampant and obvious, and the professors don't seem all that eager to punish it. It pisses him off, because it puts him at a disadvantage, since he doesn't want to cheat.

You can read the linked article, they break down their analysis in detail. Seems like low false positives at least.

Edit: thanks for doing so


> Turnitin is only ~90% effective:

No it isn't. Stop.

The cynical part of me says that the people who share this link with that summary are the cheaters trying to avoid getting caught, given that they are patently abusing the numbers, presumably because they didn't pay attention in math class.

The tests are 90% SENSITIVE. That means that of 100 AI cheaters, 10 won't be caught.

The paper you linked says the tests are 100% SPECIFIC. That means they will *never* flag a human-written paper as mostly AI.


Honestly, reading that article made me less worried about AI detection. My main concern is false positives (incorrectly identifying a human-written text as AI-written), but it seems Turnitin got that close to 0.

Of course the sample size is fairly small; I would want a larger-scale study to see whether the false positive rate is actually 5%, or 1%, 0.1%, 0.000001%, etc.


+1, I feel they've done a pretty good job and have balanced the trade-offs well.

Turnitin is in a weird spot, and probably an impossible one. Academic writing is trained to be academic writing, with meta-text and stock phrases, and students and writers tend to follow the conventions they see in other academic texts. As does AI.

On some level, human output in an academic setting is expected to be formulaic in the way AI-generated text is.

Which often could lead to false positives.


What would be high enough? I agree 90% isn't perfect, but neither are LLMs.

What can you do with 90%? Accuse people of plagiarism and ignore the fact you will hurt 10% of innocent people, while still allowing 10% of cheaters? Of course there's ambiguity in the "accuracy" term, but I assumed you can be inaccurate in both directions.

Actually, you're allowing a much higher percentage of cheaters if you read the paper. They optimized to avoid false accusations. It's only ~45-75% accurate at detecting AI writing. It's closer to 90% accurate at detecting human writing. Half the cheaters get through, and you still fail 10 percent of the people who didn't cheat.

> It's closer to 90% accurate at detecting human writing.

I know that's what they wrote, but I heavily disagree. It got 28/30 (93%) correct, but out of the two it got "wrong":

- one was just straight up not rated because the file format was odd or something

- the other got rated as 11% AI-written, which IMO is very low. I think teachers would consider this "human-written"; when I was being evaluated with Turnitin, that percentage of detected "plagiarism" would simply have been ignored.


At this point the most basic users of AI could easily be picked off, and that list of telltale styles will grow yearly.

> Of course there's ambiguity in the "accuracy" term, but I assumed you can be inaccurate in both directions.

The linked article breaks it down. The measured false positive rate is essentially 0 in this small study.


Are you going to fail 10% of students who did their own work because they supposedly cheated? What exactly can you do with this 90% accurate judgment from a black box? Perhaps not let them out on bail?

No, read the paper. They're going to pass 10% of students who cheated. The 90% figure is the false negative rate, how many AI essays it says are human.

The false positive rate is 0. The tool *never* says human writing is AI.


> The false positive rate is 0. The tool never says human writing is AI.

That cannot be true as it would be easy for a human to write in the style of AI, if they choose to. Whoever is making that claim is lying, because money...


> Are you going to fail 10% of students who did their own work because they supposedly cheated?

The linked article analyzes their data in more detail. In particular, the measured false positive rate is essentially 0 in this small study.


90% accurate doesn't mean 10% false positives; I'd want the "90% accurate" to hold 100% of the time.

This isn't Zoolander math. Or is it?


If I get AI to generate an essay and rewrite every word with my own whilst keeping the same general meaning of the original text, surely there’s no reasonable way to detect that, right?

I mean, the solution is just in-class-only essays, right? Or to stop with the weird obsession with testing and just focus on actually teaching.


Just don't grade essays? Make it clear that essays are optional and not required to get a grade, but are a good way to learn. That will cut down on the amount of work to be done, too.

Them failing exams because they didn't do the work is on them.


I think IP kind of breaks a lot of engineers' brains (despite how much of it they create), because a lot of IP law is about intent, and their theory of mind is so bad that the idea of a body of law based on deduced intent, and the ways a court might deduce their intent if they used someone else's IP, are totally alien to them.

You’ve all been very silly with the idea of intellectual properties, copyright specifically.

Every generation throughout time has had the right to recreate the legacy of human thought through the filter of their own times.

“Cultural appropriation” and other knock off terms are objectively a part of every creative and functional cycle.

Give credit where credit is due, yet once let into the world a thought becomes a part of such wilds.


The problem is really that we live in a system that demands we find commercially exploitable value in almost everything we do. If my main strategy for that involved a skill that generative AI could perfectly copy, including my style by invoking my name, I'd be pissed too.

Not to mention that when it comes to art, I'd rather consume something that someone deemed important and interesting enough to dedicate skill and time to.


>The problem is really that we live in a system that demands we find commercially exploitable value in almost everything we do.

Demands? Almost everything we do? I only spend 40-50 hours a week max doing labor that anybody would reasonably describe as being commercially exploited. No one’s broken down my door demanding I start making money on the visual novel I’m drafting in Ren’Py on the weekends, nor have I been castigated by my peers for throwing a party without charging an entrance fee.


Good for you, genuinely!

Far from a universal experience though. People who rely on art to survive right now. People who don't have the energy to do more "productive" work on weekends. And those that work weekends to survive because they don't make a living wage.


>The problem is really that we live in a system that demands...

The problem is that a system of strong copyright laws isn't going to fix this, and from everything we've seen it is making things worse.


Well, we're already seeing very asymmetric enforcement. Systems surrounding capitalism tend to bend in favor of those who already benefit from it.

There's plenty of commercially exploitable value in knowing that something was hand-crafted or even just endorsed by someone famous and impressive, and is not just a second-rate, mass-market knockoff. AI doesn't change that in any way. If it means that celebrated artists can now create even better art on an even broader scale, that's a commercial win for them. Plagiarism would not be an issue at all.

When it comes to art, I'd rather consume something that is interesting/meaningful/beautiful/revolutionary/etc. It's all about the thing itself; it has always been. Less ego in all of this could actually be a good thing.

Every artist of worth has sought two things: to bring something of beauty into the world, and to be regarded in proportion to the greatness of their art. You suggest we take from them the beauty they have brought forth and leave them nothing in return but scorn. What then has the artist to gain from their endeavors, when not only are they to be stripped of the significance of their authorship, but their works are then to be put to use feeding the all-consuming machine, which benefits not the artist but those running the machine?

Art is not just about the thing produced, stripped of its context and significance, and forced to be interpreted by ignorant minds who, in their ignorance, consider themselves capable of deriving meaning of value out of words and pictures they can scarcely comprehend from their own limited perspective.

The significance of all art is derived from its historical context, the author's implicit intentions and mode of creation, and the unique experience generated when an individual consumes the art. If you suggest only the consumer's experience matters, then you are free to forgo the greater appreciation of art in favor of the lessened experience of it if you wish. For greater awareness and understanding of the details of the parts allows us to better understand the significance of the whole. Only art of little value is lessened by our deepened understanding of it.


I don't think I disagree strongly. But I also don't think generative AI tools will do that just based on how they're built. Everything they can do, someone probably did better from scratch.

"My point of view changed when I had to step over dying people to get into my new studio space" --some famous artist

On top of this, Tech Bros want to capitalize on your talent like white bands covering Black artists during segregation.

There's a forced reframing going on of what it means to be an artist, and what it means to appreciate artistry. Over time we've developed the idea that art, once created, is not free for the observing; the artist has a right to compensation.

It's an understandable position for these reasons:

- We like art and we want to show our support and appreciation for art

- The most straightforward way to show support and appreciation for art is to give the artist money

- Much of the art we appreciate was only possible due to the promise of monetary gain on the part of the artist

But there are some old, unavoidable questions:

- At what point does the pursuit of monetary gain begin to diminish one's own artistic expression?

- At what point does the pursuit of monetary gain begin to diminish other peoples' artistic expression?

As you point out, there is no art without appropriation and re-creation.

And now there are some new, unavoidable facts:

- Appropriation is becoming easier

- Attribution is therefore becoming more difficult

- Compensation is therefore becoming more difficult

- Rewinding the clock is impossible

The only way out of this would be for humanity to collectively take a puritanical stance on art, where any form of appropriation is demonized. I think this would make art suck.


>- The most straightforward way to show support and appreciation for art is to give the artist money

but it is quite notorious that people don't actually like doing that, especially, I just have to point it out, here on HN. So...

At what point does the absence of monetary gain begin to diminish artistic expression?


> Appropriation is becoming easier

My deck BBQ caught on fire: a problem. Versus: the 35,000 hectares next to my house are on fire, with 20-meter-tall flames.

Is "appropriation" now "easier"? For whom? At what scale to deliver? At what scale to ingest?


The analogies you're trying to connect are suspect at best.

Call me when OpenAI gives away all its intellectual property for free.

We are all plagiarists the moment we touch AI

Previous generations didn't have to compete with massive instant access to everything ever written to facilitate plagiarism, or with AI-generated slop...

And everything wasn't "content", nor did they have massive numbers of influencers and public content creators, nor was there a push even for laymen to churn out heaps of text every day or to project an image to the whole world.

And until recently if you got caught plagiarizing you were shamed or even fired from journalism. Now it's just business as usual...


This kind of “oh everybody does it” dismissiveness towards cultural appropriation comes off as possibly ignorant but awfully insensitive. What is your understanding of the term? What does it describe, and when people use it as a negative, what legitimate issues are they concerned about?

> “Cultural appropriation” and other knock off terms are objectively a part of every creative and functional cycle.

You'd think it was more complicated than that if the people who were doing a caricature of you had enslaved and murdered your family, and lived in the house your family built while you lived on the street.

It doesn't matter, because culture works how it works (and is often used as a political tool), and somehow world culture ends up being people pretending to be Americans pretending to be the descendants of American slaves. But it's undeniably ugly.


"Cultural appropriation" is a totally separate issue to intellectual property and copyright. You're muddying the waters by conflating the two.

Cultural appropriation was a term popularised in the heady days of woke excesses when white liberals were desperate to find reasons to be mad at one another for perceived impurity. It's a ludicrous concept from top to bottom.

Intellectual property laws, in my opinion, have a place in our society.


This account was created one hour ago; we should ignore this comment. :)

Think about how it feels when you toil on a hard problem, do your best work, and release it to the world in the spirit of openness and sharing,

Only to have a machine ingest, compress, and reiterate your work indefinitely without attribution.


> Only to have a machine ingest, compress, and reiterate your work indefinitely without attribution.

Everything I write, every thought I have, and the output of my every creative endeavor is profoundly shaped by the work of others that I have ingested, compressed, and iterated on over the course of my lifetime, yet I have the audacity to call it my own. Any meager success I may have, I attribute to naught but my own ingenuity.


Here you liken a human to a machine; it is telling of our times that we fail to see the difference.

    Only to have a machine ingest, compress, and reiterate your work indefinitely without attribution.
Further facilitating millions, or even billions, of other people to discover new ideas and create new things. It's not hard to see the benefit of that.

I get that the purpose of IP laws is psychological, rather than moral. A culture where people feel as though they can personally benefit from their work is going to have higher technological and scientific output, which is certainly good, even if the means of producing that good are somewhat artificial and selfish.

It's not hard to imagine, or maybe dream of, a world where the motivation for research and development is not just personal gain. But we have to work with the world we have, not the world we want, don't we...

Nobody will starve themselves, even if doing so will feed hundreds of others.


> the purpose of IP laws is psychological, rather than moral.

Neither. They are purely economic. You even acknowledge this when you call out personal benefit.

The stated intent is to facilitate creators realizing economic benefits from time spent creating. The reality is that large corporations end up rent seeking using our shared cultural artifacts. Both impacts are economic in nature.


Right, right.

The economic benefit is derived from a psychological effect: the expectation of personal gain.

The economy as a whole benefits from technological progress. The technological progress is fueled by each individual's expectation of personal gain. The personal gain is created via IP law.


If someone shows up to work based on the expectation that they will receive a paycheck at the end of the month would you also describe that as a psychological effect? I certainly wouldn't. That's an economic activity.

There's a psychological component regarding trust. Either that your employer would never try to cheat you or alternatively that your employer is the sort that might try to cheat you but won't thanks to our society's various systems. But the showing up to work itself is a simple exchange of time and skill for money.


In the case that the IP, and thus the financial benefit, is not owned by an individual, but owned by a large corporation, as with your example, what does the individual care whether or not the IP is infringed?

They don't. In this case democratizing the IP is more likely a social / economic benefit, not a harm.

We're talking about intellectual property rights, the benefits of which only go to the intellectual property holder.

Although how big a corporation has to be before we cross the line from social / economic harm to social / economic good is an interesting question.


You are conflating work and work product. There's a difference between being acknowledged and compensated for doing hard work, and receiving property rights over the work product.

If you are an employee, you get paid for building something (work), and the employer owns the thing that was built (work product). If you are self-employed, it's the other way around. You don't get paid for the work, but you own the work product. Employees generally don't work for free, and the self-employed generally don't give away their capital for free.

If you opt to "release it to the world in the spirit of openness and sharing," then you built capital for free and gave it away for free. If you didn't want others to capitalize on the capital, then why did you give it away?

If you want attribution, then either get paid for the work and add it to your resume, or exchange your work product for attribution (e.g., let people visit the Jryio Museum, build a Jryio brand, become known in your community as a creative leader, etc.). If you give it away for free, then your expectations should include the possibility that people will take it for free.


I would be fine with it personally. But I'm a mathematician not an artist.

Are those feelings serving you?

What consideration do you choose to afford to those feelings?


I'd highly recommend RiP!: A Remix Manifesto [Grokipedia](https://grokipedia.com/page/rip_a_remix_manifesto) [YouTube](https://youtu.be/quO_Dzm4rnk). It's been a long time since I watched it, so I'll have to re-watch it. Keep in mind this came out almost 20 years ago, well before LLMs. I don't know where we'll land, but humans are also remix machines; that's what creativity is. I think the beauty of LLMs is that they are the first technology to capture some essence of creativity. Of course, they lack direct emotion/sentiment, but I also think people fail to understand the role of the human in using the LLM: we are the final filter, recognizing quality, projecting emotion/sentiment, etc.

Cutting back the power of creators dramatically increases the power of distributors. Do we really want the vast majority of economic benefit for human creativity to flow to middlemen?

And strengthening copyright causes distributors to assign themselves the new copyrights in take-it-or-leave-it contracts. Making authors' rights non-transferable (as in, e.g., Germany) goes some way toward preventing this.

That's been the trend, yes.

Look how much power lies in the hands of the people who sit between petroleum in the ground and its combustion. It's a whole waterfall, and the majority of the "wealth" in society seems to consist of people spinning their wheels while siphoning from it. And now they're terrified it'll go away.

The AI "gold rush" really has this feeling. "How can I get my finger in the pie somewhere here?"

"All that is solid melts into air"


> How can I get my finger in the pie somewhere here?

Given the performance of open weight models to date it looks as though that might prove fairly difficult in the medium to long term.


> What counts as intellectual theft—and what is considered acceptable borrowing or inspiration—are the great questions of the AI era.... Part of the problem is that there is no precise definition of plagiarism—it ranges from verbatim copying to the fudgier theft of concepts.

> ...

> Moreover, using ChatGPT, Mr Kreuz argues, does not make you a plagiarist, since it is not cribbing from a single “original” text. He suggests LLMs are doing unacknowledged “ghostwriting”. To many that is too generous: this is still plagiarism, but with an AI accomplice.

Meanwhile, the site itself offers AI-powered narration for the article. The AI voice's vocal characteristics had to come from somewhere. Is it any different for prosody than for prose?


Without arguing the broader topic, I do think there's an important distinction between plagiarism in fiction and in non-fiction or academic work: the theft of ideas.

In fiction, taking ideas (hero's journey, middle earth, etc)[1] and adapting to a new story/characters is totally fine without attribution. There's probably only like 5 stories ever that just keep getting re-written this way.

But in non-fiction, academic research and the like, stealing ideas without attribution is a problem, because ideas are the whole point. Nobody reads a research paper for the plot.

But in school, and especially with non-fiction, we're so often told to "just re-word it to make it your own", which is actually the most insidious form of plagiarism. If I get an idea from you and want to include it in my paper, that's great, but I have to give you credit. Great non-fiction books I've read are riddled with citations and have 100-page bibliographies. The value of the book/paper is (often) in the synthesis of those ideas into something new, with maybe its own ideas added on top. But "re-wording" does not make an idea your own, and does not escape a charge of plagiarism.

[1] top comment as of this writing


> But in school, and especially with non-fiction, we're so often told to "just re-word it to make it your own", which is actually the most insidious form of plagiarism.

I think you might be confused, or had unclear teachers.

You're told to re-word it but still cite it. There are different combinations here:

1. Copy verbatim, no quotes, cite. Plagiarism, because you're copying the wording without quoting (even though you're citing).

2. Copy verbatim, quote, cite. Correct.

3. Paraphrase, no quotes, cite. Correct.

4. Paraphrase, no quotes, don't cite. Plagiarism if not "common knowledge".

Teachers should be telling you to do 3 rather than 1. You are maybe confusing 3 with 4, thinking they were telling you to do 4? (Or your teachers were just wrong?)

But the difference between 3 and 4 can actually get legitimately confusing in certain cases, even for academics, because there are a lot of ideas that are just "in the air" and it's not always clear if something is "common knowledge" or if there's some original citation for it somewhere.


Here's a question: if nobody had ever written science fiction, would AI do it?

I don't think so.


Science fiction is as old as fiction. The Epic of Gilgamesh (c. 2000 BC) and the Ramayana (c. 500 BC) have sci-fi elements. There's nothing innovative or unique about stories that imagine a future instead of a past, present, or alternate reality.

Genres are too vague and generic to be ownable by anybody. Inspiration is not plagiarism.


If there is some LLM trained only on old content, maybe someone could describe a style invented after its training cutoff, tell the AI to write a novel in that style, and compare the result to the "real" style.

As far as I am aware, AI won't write/do anything without an input prompt... or has something changed?

I've built a React invoicing tool, and I can't help but think that it probably ripped off a bunch of code. I've added my own touch now, but it seems like it was way faster with generation. But then again, it's hardly rocket science.

More like maybe we are acknowledging "intellectual property" was always a fiction

There are two issues here: (A) plagiarizing as stealing from another source (damages to a 3rd party), and (B) plagiarizing as pretending another's ideas are your own (damages to self).

Most of us here probably agree that A is not really a problem, but where people stand on issue B is where the value divide comes from.


Good artists copy, great artists steal.

-- me


No, speak for yourselves.

I really don't know how I feel about that Ctrl/Control joke.

It’s classic Economist humor

Wow, I love the illustration!

We spend a lot of time talking about the fairness of how LLMs are trained but not enough time talking about the fact that mediocre people now have a faucet they can turn on to flood work and content into the world effortlessly at volume.

Something I found disappointing is discovering what plagiarists the ancient greats were. Take Paradise Lost, for instance. The entire thing is an unoriginal, fan-fiction derivative work of the Bible (itself questionable).

Of Mans First Disobedience, and the Fruit Of that Forbidden Tree, whose mortal taste…

Ummm, excuse me. This is literally the garden of Eden. In fact this idiot plagiarizes the name too. He actually calls this Eden. wtf. Fake as fuck. And people call this copy-paste artist who cites literally zero of his sources a “poet”.


Honey there ain’t nothing new under the sun.

Umm excuse me. Are you going to use an LLM to plagiarize or are you going to cite that?


