• 0 Posts
  • 47 Comments
Joined 1 year ago
Cake day: June 4th, 2023






  • As someone working and publishing in the field, I'd say this is more of a circlejerk about American exceptionalism than something that's actually true.

    Chinese universities and companies publish a shit tonne at pure machine learning conferences. They absolutely do a large amount of research into the fundamentals of machine learning as well as the applied stuff. They’re probably the closest to the US in terms of having large firms that are prepared to bankroll the training of very large language models.

    Alibaba in particular has been constantly doing cutting edge stuff in terms of multimodal language models that are worth paying attention to.

    The actual truth is that China does both kinds of work: broad foundational and applied work led by independent research groups in companies and universities, and focused application-driven stuff for direct use by the state.

    Google still stands out in terms of the amount of research it does, but this is because Google is different from everyone; other US research institutes don’t compare to it either.


  • (Swiss) Germans are completely mad about food.

    It’s their culture to complain about everything, except food. All they care about is that it’s as bland as possible and has big portions. If you manage that, they’ll give you five stars every time.

    I spent 3 years living in Germany, and not only can you not get anything spicy for love nor money, they also don’t use herbs. It just blows my mind. They’re physically so close to France and Italy, but the food is so far away.








  • OhNoMoreLemmy@lemmy.ml to Science Memes@mander.xyz · Sardonic Grin
    English · 2 upvotes · edited · 2 months ago

    I’m not describing binary classification, I’m describing multiclass. “Group classification” isn’t really a thing. Yes, your ML system probably guesses what kind of plant it is and then looks up the edibility of its components.

    The problem with this is how it will handle rare plants that aren’t in the dataset, or that are in the dataset but with insufficient data to be recognised.

    Because a multiclass classifier assumes that it’s seen representative data on all possible outputs (e.g. plant types), it will tend to be dangerously confident on plant types it hasn’t seen before.

    This is because it can rule out other classes. E.g. if you’re trying to classify as rose, tulip, or daisy and you get a bramble, your classifier is likely to be very certain it’s a rose because tulips and daisies don’t have thorns. So your softmax score is likely to show heavy confidence in rose even though it’s actually none of them.

    This is exactly what can go wrong when you try to use the softmax/standard multiclass approach and come across an interesting rare mushroom or wild carrot. You don’t want it to guess which type of plant in the database it’s most like, even if this guess comes with scores; you want it to say that it genuinely doesn’t know and that you shouldn’t eat it.
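
    To make the rose/tulip/daisy example concrete, here is a minimal sketch. It is not a trained model: the three features and the weights are made up by hand purely for illustration, and the names (`WEIGHTS`, `classify`, `bramble`) are hypothetical. It only shows how softmax can report high confidence for an input that belongs to none of the classes.

```python
import math

def softmax(logits):
    """Turn raw scores into positive values that sum to one."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy classifier: one hand-picked weight vector per class over
# three made-up features (has_thorns, cup_shaped_flower, ray_petals).
# These numbers are assumptions for illustration, not learned from data.
CLASSES = ["rose", "tulip", "daisy"]
WEIGHTS = {
    "rose": (5.0, 0.0, 0.0),
    "tulip": (0.0, 5.0, 0.0),
    "daisy": (0.0, 0.0, 5.0),
}

def classify(features):
    """Return a softmax score for each known class."""
    logits = [sum(w * f for w, f in zip(WEIGHTS[c], features)) for c in CLASSES]
    return dict(zip(CLASSES, softmax(logits)))

# A bramble: thorny, and only faintly like the other two classes.
bramble = (1.0, 0.1, 0.2)
scores = classify(bramble)
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

    With these made-up numbers the bramble comes out as "rose" with a score well over 0.9, simply because the other two classes are ruled out. Nothing in the output signals that the input is none of the three.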


  • OhNoMoreLemmy@lemmy.ml to Science Memes@mander.xyz · Sardonic Grin
    English · 5 upvotes, 2 downvotes · 2 months ago

    The key issue here is that ‘level of certainty’ doesn’t really mean what you would like it to.

    You get back a number, yes, but it can change according to what’s visible in the background, the angle the plant is at, how close it is to the camera, and how good your camera is (professional photographers use expensive cameras and take shots of different things to everyone else).

    Interpreting this score as “how safe is it to eat the plant” is a really bad idea. You will still eat the wrong plant. These scores can lead to very confident random guessing when you show it a plant it’s never seen before.

    And no, softmax is a trick for making the scores all sum to one, so you get back a confidence for every possible thing the image could be of.
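
    For reference, this is the usual textbook form of softmax. The symbols are my own notation, not something from the thread: z_1, ..., z_K are the raw scores the network produces for the K possible classes, and p_i is the reported confidence for class i.

```latex
p_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}},
\qquad
\sum_{i=1}^{K} p_i = 1
```

    Because the outputs are forced to sum to one, the top class always gets a score of at least 1/K, whatever is in the image. The normalisation says nothing about whether any of the K classes is actually the right answer.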



  • OhNoMoreLemmy@lemmy.ml to Science Memes@mander.xyz · ✨️ Finish him. ✨️
    English · 16 upvotes, 1 downvote · edited · 4 months ago

    It’s worth saying that ML is in a very different position to most of academic publishing.

    All of the serious journals are free to publish in and fully open access, and a significant share of publications include enough code that things are mostly replicable. GitHub has done wonders for our field. Also, many tech companies treat publications as a mark of prestige and go out of their way to publish stuff.

    We’re still drowning in too many papers and 95% of everything is shit, but that’s every field really. Talking to Musk on Twitter is not the right place for a nuanced discussion about publication.



  • It’s a consequence of parliamentary sovereignty.

    Parliament can always dissolve itself and call an election, and it’s an important mechanism for getting rid of the government.

    The problem is that the prime minister also has a majority in parliament, and that means he can make parliament dissolve itself when he likes.

    This was actually a problem for Johnson. Initially, he didn’t have enough of a majority and it wasn’t clear he could call an election without Corbyn’s support.