Fatal Exception: Bad Cast

Bug #1648178 reported by kaputtnik
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
widelands
Fix Released
Critical
Unassigned

Bug Description

With current trunk (bzr 8207) i encountered some crash's which i can't reproduce with specific steps. Now i got it with gdb:

Console output:

Fatal exception: std::bad_cast
Game: Writing Preload Data ...
Thread 1 "widelands" received signal SIGSEGV, Segmentation fault.
0x00000000010f0883 in Widelands::GamePreloadPacket::write (this=0x7fffffffa260, fs=..., game=...)
    at /home/kaputtnik/Quellcode/widelands-repo/trunk/src/game_io/game_preload_packet.cc:113
113 s.set_int("gametype", static_cast<int32_t>(game.game_controller()->get_game_type()));

gdb backtrace is attached.

I guess this is the same issue as described in bug 1648152, but since build 19 is used there, i decided to add a new bug.

Tags: crash savegame

Related branches

Revision history for this message
kaputtnik (franku) wrote :
Revision history for this message
GunChleoc (gunchleoc) wrote :

I am getting it in the following branch as well, triggers pretty often in the Economy tutorial when the tavern is burned down.

https://code.launchpad.net/~widelands-dev/widelands/bug-1426465-scenario-timings/+merge/311119

Changed in widelands:
status: New → Confirmed
milestone: none → build20-rc1
importance: Undecided → Critical
tags: added: crash savegame
Revision history for this message
TiborB (tiborb95) wrote :

Can somebody provide savegame - I would look at this but I dont want to spend so much time till I get to the point when it crashes.

Revision history for this message
kaputtnik (franku) wrote :

Tibor, the crash happens some times, and some times not. So a savegame isn't helpful imho. I attache one anyway.

Revision history for this message
TiborB (tiborb95) wrote :

I meant Gun could provide either savegame or steps where/how to trigger it..

Revision history for this message
kaputtnik (franku) wrote :

I think the only thing when this happens is the autosave. Maybe plus some things in game happens (like burning down a building) during, or close before, the autosave. But as said: It is not really reproducible right now (from my side).

Revision history for this message
TiborB (tiborb95) wrote :

From the backtrack it seems to me that a problem happens and just then emergency autosave is triggered and even this fails.

But I must admit that now when I test AI very extensively I have quite a lot of crashes - and logs (output to console) ends without any hint what is going on. So there is some problem somewhere...

Revision history for this message
GunChleoc (gunchleoc) wrote :

To reproduce, start the Economy Tutorial from the branch that I linked, then play until the Tavern burns down. That's the point where it tends to crash.

Revision history for this message
TiborB (tiborb95) wrote :

@Gun, after commenting out two try/catch instances I got this backtrace

#0 0x00007ffff513d04f in raise () from /usr/lib/libc.so.6
#1 0x00007ffff513e47a in abort () from /usr/lib/libc.so.6
#2 0x00007ffff5a544bd in __gnu_cxx::__verbose_terminate_handler () at /build/gcc/src/gcc/libstdc++-v3/libsupc++/vterminate.cc:95
#3 0x00007ffff5a52276 in __cxxabiv1::__terminate (handler=<optimized out>)
    at /build/gcc/src/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:47
#4 0x00007ffff5a522c1 in std::terminate () at /build/gcc/src/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:57
#5 0x00007ffff5a524d8 in __cxxabiv1::__cxa_throw (obj=0x9888e00, tinfo=0x18faad0 <typeinfo for std::bad_cast@@GLIBCXX_3.4>,
    dest=0x7ffff5a50750 <std::bad_cast::~bad_cast()>) at /build/gcc/src/gcc/libstdc++-v3/libsupc++/eh_throw.cc:87
#6 0x00007ffff5a51142 in __cxxabiv1::__cxa_bad_cast () at /build/gcc/src/gcc/libstdc++-v3/libsupc++/eh_aux_runtime.cc:34
#7 0x0000000000d5da01 in Widelands::Building::descr (this=0x9888cf0)
    at /var/widelands/BZR/bug-1426465-scenario-timings/src/logic/map_objects/tribes/building.h:198
#8 0x0000000000d87c2f in Widelands::Immovable::Immovable (this=0x9888cf0, imm_descr=..., former_building=0x9888cf0)
    at /var/widelands/BZR/bug-1426465-scenario-timings/src/logic/map_objects/immovable.cc:349
#9 0x0000000000d87b93 in Widelands::ImmovableDescr::create (this=0x881e0c0, egbase=..., coords=..., former_building=0x9888cf0)
    at /var/widelands/BZR/bug-1426465-scenario-timings/src/logic/map_objects/immovable.cc:333
#10 0x0000000000d250fb in Widelands::EditorGameBase::create_immovable (this=0x7fffffffca10, c=..., idx=11 '\v',
    type=Widelands::MapObjectDescr::OwnerType::kTribe, former_building=0x9888cf0)
    at /var/widelands/BZR/bug-1426465-scenario-timings/src/logic/editor_game_base.cc:335
#11 0x0000000000d2533a in Widelands::EditorGameBase::create_immovable (this=0x7fffffffca10, c=..., name="destroyed_building",
    type=Widelands::MapObjectDescr::OwnerType::kTribe, former_building=0x9888cf0)

Obviously, the former_building is involved here. Would you look at this?

Revision history for this message
GunChleoc (gunchleoc) wrote :

I only have 2 days for this, because I will be travelling, so I don't know if I can find the bug by then. Will try though.

Revision history for this message
TiborB (tiborb95) wrote :

Let me know then, I can try as well, though you are more familiar with the stuff.

Revision history for this message
GunChleoc (gunchleoc) wrote :

It would be brilliant if you could take this on, I haven't even started on this yet.

I expect that the offending commit is http://bazaar.launchpad.net/~widelands-dev/widelands/trunk/revision/8186

There is another commit from this week adjusting the packet number, but the crash is older than that, so it shouldn't be relevant.

http://bazaar.launchpad.net/~widelands-dev/widelands/trunk/revision/8210

As far as I can see, I initialized everything with nullptr, so I'm not sure where this is coming from.

Revision history for this message
TiborB (tiborb95) wrote :

OK, I will do my best :)
What is bit strange is the randomness...

Is the former_building stuff already in B19?

Revision history for this message
GunChleoc (gunchleoc) wrote :

No, it's new after B19. So, either the former_building isn't the culprit after all, or https://bugs.launchpad.net/widelands/+bug/1648152 is a separate bug.

Revision history for this message
TiborB (tiborb95) wrote :

I think the problem is somewhere here:

Immovable(const ImmovableDescr& imm_descr, const Widelands::Building* former_building)

with that Widelands::Building*, I am quite unsure about these memory issues, but it seems to me that passing a building here is not reliable. I would prefer something like this:

Immovable(const ImmovableDescr& imm_descr, const ImmovableDescr& former_imm)

Also I found also this in building.h:

 /**
  * The former buildings vector keeps track of all former buildings
  * that have been enhanced up to the current one. The current building
  * index will be in the last position. For construction sites, it is
  * empty except enhancements. For a dismantle site, the last item will
  * be the one being dismantled.
  */
 const FormerBuildings get_former_buildings() {
  return old_buildings_;

Can this be used as well?

Revision history for this message
TiborB (tiborb95) wrote :

old_buildings_ are probably not the way to go. This is just static list of buildings based on description...

Revision history for this message
TiborB (tiborb95) wrote :

Am I right that former_building is used only for burning sites?

Revision history for this message
TiborB (tiborb95) wrote :

I did small change to my branch:
http://bazaar.launchpad.net/~widelands-dev/widelands/ai-post-b19-2/view/head:/src/logic/map_objects/immovable.cc#L333
and it seems the crash is gone (and also the functionality of former_building)

Revision history for this message
GunChleoc (gunchleoc) wrote :

Yes, it is only used for burning sites.

I passed in a Building instead of a BuildingDescr because I needed the owner (Player) for some reason.

Revision history for this message
SirVer (sirver) wrote :

Maybe the building is already deleted at this point? We could double check by adding a log(this); in ~Building and a log(previous_building) in this area and see which shows up first in the console.

Revision history for this message
SirVer (sirver) wrote :

Alternatively we could pass former building in as a OPtr<> which would guarantee that it still lives at this point (or crash).

Revision history for this message
TiborB (tiborb95) wrote :

Would OPtr really guarantee that the object kept alive, or it would just safely returned nullptr if it is gone. The consequence would be that sometimes (randomly) the former building would be unknown.

@Gun - I noticed that the owner stuff in diff, but I wondered why it is needed now, but was not needed before...

Revision history for this message
SirVer (sirver) wrote :

> Would OPtr really guarantee that the object kept alive, or it would just safely returned nullptr if it is gone.

the latter. I do not suggest doing this as the fix, but to investigate if the deletion of the former building is indeed the problem - it should not give as many false negatives as a plain pointer.

Revision history for this message
TiborB (tiborb95) wrote :

I somehow understand what OPtr should be, I also grepped the code to better grasp it, then I spent some time reworking the code, but now I am stuck at this:

....immovable.cc:349:63: error: ‘struct Widelands::OPtr<Widelands::Building>’ has no member named ‘descr’
      former_building_descr_(former_building ? former_building.descr() : nullptr),

What do you think about it? Is there something like get_object()?

I can push the branch if you want...

Revision history for this message
GunChleoc (gunchleoc) wrote :

I remember changing some code around during code review, maybe I messed up something there.

During a running game, a Building* is passed to the constructor to get at the BuildingDescr for the former_building and the owner. So, the Immovable is storing the descr and the owner.

During savegame load, the Immovable is initialized with a nullptr and the former_building and the owner are set manually afterwards.

Revision history for this message
SirVer (sirver) wrote :

I just saw this bug the first time when running the test suite for nightly builds. It seems intermittent there too.

Revision history for this message
SirVer (sirver) wrote :

I did the following to figure this out:

- I added debug statements in create_immovable() and ~Building()
- pause the game.
- order a building to be burned down.
- unpause the game

This is the relevant log:

#sirver Destroying building this: 0x7fe9f35066b0
#sirver Created immovable: former_building: 0x7fe9f35066b0

So it is as I thought: the building gets destroyed before the immovable reads its data. This works "most of the time" since nobody overwrote the memory already, so it might still contain the data from the former_building. But something else might have gotten a hold of this memory already and overwrote it with something that has nothing to do with the building.

It even says so in the code in Building::destroy():

// We are deleted. Only use stack variables beyond this point
if (fire) {
  egbase.create_immovable(pos, "destroyed_building", MapObjectDescr::OwnerType::kTribe, this);
}

'this' is of course a heap variable - we should have seen this in code review. :(

Changed in widelands:
status: Confirmed → Triaged
Revision history for this message
TiborB (tiborb95) wrote :

So I see two primitive ways to fix:
- pass just a string containing name of previous building
- pass a desc of previous building - if this is possible

Revision history for this message
TiborB (tiborb95) wrote :

And one small comment - I had subjective feeling that probability of crash is higher if computer is more stressed in regard to RAM occupancy, this nicely fits to the character of bug...

Revision history for this message
SirVer (sirver) wrote :

I have a fix ready for review soonish.

SirVer (sirver)
Changed in widelands:
status: Triaged → In Progress
SirVer (sirver)
Changed in widelands:
status: In Progress → Fix Committed
Revision history for this message
GunChleoc (gunchleoc) wrote :

Fixed in build20-rc1

Changed in widelands:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.