#10710 ppc64le builder restarting endlessly
Closed: Can't Fix 2 years ago by kevin. Opened 2 years ago by jjames.

  • Describe the issue
    This build has been running since yesterday: https://koji.fedoraproject.org/koji/buildinfo?buildID=1935481. All arches except ppc64le finished in a reasonable amount of time. The ppc64le build, however, appears to be restarting over and over. Can somebody please look at it and see what is going on?

  • When do you need this? (YYYY/MM/DD)
    No particular deadline, but it would be nice if the build would finish.

  • When is this no longer needed or useful? (YYYY/MM/DD)
    Never.

  • If we cannot complete your request, what is the impact?
    I can't build Singular, which means I can't update sagemath.


It's hitting the OOM killer, which is killing kojid. ;(

Mar 18 17:07:04 buildvm-ppc64le-23.iad2.fedoraproject.org systemd-oomd[3860467]: Killed /system.slice/kojid.service due to memory used (20974206976) / total (21303853056) and swap used (10849484800) / total (11811028992) being more than 90.00%
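For anyone reading that log line later, here is a minimal sketch that pulls the four byte counts out of the message and turns them into percentages. It assumes the message format shown above; the function name and regex are illustrative only and are not part of systemd or koji.

```python
import re

# The message portion of the journal line quoted above.
LOG_LINE = (
    "Killed /system.slice/kojid.service due to memory used (20974206976) / "
    "total (21303853056) and swap used (10849484800) / total (11811028992) "
    "being more than 90.00%"
)


def parse_oomd_line(line):
    """Return (memory_fraction, swap_fraction, limit_fraction) from the message.

    Hypothetical helper: the pattern is written against the one line quoted
    in this ticket, not against a documented format.
    """
    pattern = (
        r"memory used \((\d+)\) / total \((\d+)\) and "
        r"swap used \((\d+)\) / total \((\d+)\) being more than ([\d.]+)%"
    )
    match = re.search(pattern, line)
    if match is None:
        raise ValueError("not a recognised oomd kill message")
    mem_used, mem_total, swap_used, swap_total = (
        int(group) for group in match.groups()[:4]
    )
    limit = float(match.group(5)) / 100
    return mem_used / mem_total, swap_used / swap_total, limit


mem, swap, limit = parse_oomd_line(LOG_LINE)
print(f"memory: {mem:.1%}, swap: {swap:.1%}, limit: {limit:.0%}")
```

Both memory and swap come out above the 90% limit reported in the message, which matches kojid being killed.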

The last things I see in the build logs are:

><lib nets (d2t d2t_singular/nets_lib.doc==>d2t_singular/nets_lib.tex(catNets)(netBigIntMat)(netBigIn
tMatShort)(netCoefficientRing)(netIdeal)(netInt)(netBigInt)(netIntMat)(netIntMatShort)(netIntVector)(
netIntVectorShort)(netNumber)(netList)(netMap)(netMap2)(netmatrix)(netmatrixShort)(netPoly)(netPrimeP
ower)(netRing)(netString)(netvector)(netvectorShort)(stackNets)==>d2t_singular/nets_lib.tex)        
><lib phindex (d2t d2t_singular/phindex_lib.doc==>d2t_singular/phindex_lib.tex(signatureL)(signatureL
qf)(PH_ais)(PH_nais)==>d2t_singular/phindex_lib.tex)                                                
><lib polybori (d2t d2t_singular/polybori_lib.doc==>d2t_singular/polybori_lib_noEx.tex{boolean_std}{b
oolean_constant}{boolean_poly}{direct_boolean_poly}{recursive_boolean_poly}{boolean_ideal}{boolean_se
t}{from_boolean_constant}{from_boolean_poly}{direct_from_boolean_poly}{recursive_from_boolean_poly}{f
rom_boolean_ideal}{from_boolean_set}{bvar}{poly2zdd}{zdd2poly}{disp_zdd}==>d2t_singular/polybori_lib_
noEx.tex)                                                                                           
><lib sets (d2t d2t_singular/sets_lib.doc==>d2t_singular/sets_lib.tex(set)(union)(intersectionSet)(co
mplement)(isElement)(isSubset)(isSuperset)(addElement)==>d2t_singular/sets_lib.tex)                 
><lib autgradalg (d2t d2t_singular/autgradalg_lib.doc==>d2t_singular/autgradalg_lib.tex(autKS)(autGra
dAlg)(autGenWeights)(stabilizer)(autXhat)(autX)==>d2t_singular/autgradalg_lib.tex)                  
><lib difform (d2t d2t_singular/difform_lib.doc==>d2t_singular/difform_lib.tex(diffAlgebra)(diffAlgeb
raListGen)(difformFromPoly)(difformCoef)(difformHomogDecomp)(difformToString)(difformPrint)(difformIs
Gen)(difformAdd)(difformSub)(difformNeg)(difformMul)(difformDiv)(difformEqu)(difformNeq)(difformIsBig
ger)(difformIsSmaller)(difformDeg)(difformIsHomog)(difformIsHomogDeg)(difformListCont)(difformListSor
t)(difformUnivDer)(difformDiff)(derivationFromList)(derivationFromPoly)(derivationConstructor)(deriva
tionToString)(derivationPrint)(derivationAdd)(derivationSub)(derivationNeg)(derivationMul)(derivation
Equ)(derivationNeq)(derivationEval)(derivationContraction)(derivationLie)==>d2t_singular/difform_lib.
tex)

I guess this is in tests?

Perhaps @sharkcz could figure out more...

It isn't tests, sadly. This is part of building singular.idx, which some consumers need. I don't know how to make that step consume less memory, but I'll poke around in the code and see if anything jumps out at me.

Try reducing parallelism? E.g. by %constrain_build -c1.
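The general idea behind limiting parallelism for memory reasons can be sketched as below. This is only an illustration of capping the job count by a memory budget; it is not how %constrain_build is implemented, and the function name, parameters, and numbers are all hypothetical.

```python
def capped_jobs(cpu_count, available_mb, per_job_mb):
    """Largest job count that fits the memory budget.

    The result is never below 1 and never above cpu_count.
    All three parameters are illustrative assumptions.
    """
    if per_job_mb <= 0:
        raise ValueError("per_job_mb must be positive")
    return max(1, min(cpu_count, available_mb // per_job_mb))


# Example: 8 CPUs, about 20000 MB available, 4000 MB budgeted per job.
print(capped_jobs(8, 20000, 4000))
```

If a single job needs more memory than the builder has, the cap bottoms out at one job and the build can still run out of memory.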

The actual building of the ELF objects seems fine, so I think I will try adding -j1 to the documentation building make invocation. I'll kill this build and try again. Wish me luck.

The new task seems to be having the same issue. I would appreciate any other thoughts on how to proceed.

Metadata Update from @mohanboddu:
- Issue tagged with: medium-gain, medium-trouble, ops

2 years ago

I was able to figure out how to work without singular.idx for the ppc64le build. This will make the in-program help less usable (well, pretty much unusable), but then I expect the number of Singular users on ppc64le is approximately zero. The build has been done in Rawhide. If there is no obvious solution to this issue, go ahead and close this ticket. If there is, I prefer not to have architecture-specific bits in the spec file when it can be avoided. Thanks.

Yeah, I can't think of much... we may be able to increase memory on builders at some point when we get new hardware, but I have no idea when that might be.

Sorry for the hassle.

Metadata Update from @kevin:
- Issue close_status updated to: Can't Fix
- Issue status updated to: Closed (was: Open)

2 years ago
